PREFACE


This document describes the DoD Standard Transmission Control Protocol
(TCP).  There have been nine earlier editions of the ARPA TCP
specification on which this standard is based, and the present text
draws heavily from them.  There have been many contributors to this work
both in terms of concepts and in terms of text.  This edition clarifies
several details and removes the end-of-letter buffer-size adjustments,
and redescribes the letter mechanism as a push function.

Jon  Postel 

Editor

TRANSMISSION CONTROL PROTOCOL

DARPA  INTERNET  PROGRAM 
PROTOCOL  SPECIFICATION 


1.  INTRODUCTION 

The  Transmission  Control  Protocol  (TCP)  is  intended  for  use  as  a  highly 
reliable  host-to-host  protocol  between  hosts  in  packet-switched  computer 
communication  networks,  and  in  interconnected  systems  of  such  networks. 

This document describes the functions to be performed by the
Transmission Control Protocol, the program that implements it, and its
interface to programs or users that require its services.


1.1.  Motivation

Computer  communication  systems  are  playing  an  increasingly  important 
role  in  military,  government,  and  civilian  environments.  This 
document  focuses  its  attention  primarily  on  military  computer 
communication  requirements,  especially  robustness  in  the  presence  of 
communication  unreliability  and  availability  in  the  presence  of 
congestion,  but  many  of  these  problems  are  found  in  the  civilian  and 
government  sector  as  well. 

As  strategic  and  tactical  computer  communication  networks  are 
developed  and  deployed,  it  is  essential  to  provide  means  of 
interconnecting  them  and  to  provide  standard  interprocess 
communication  protocols  which  can  support  a  broad  range  of 
applications.  In  anticipation  of  the  need  for  such  standards,  the 
Deputy  Undersecretary  of  Defense  for  Research  and  Engineering  has 
declared  the  Transmission  Control  Protocol  (TCP)  described  herein  to 
be  a  basis  for  DoD-wide  inter-process  communication  protocol 
standardization.

TCP  is  a  connection-oriented,  end-to-end  reliable  protocol  designed  to 
fit  into  a  layered  hierarchy  of  protocols  which  support  multi-network 
applications.  The  TCP  provides  for  reliable  inter-process 
communication  between  pairs  of  processes  in  host  computers  attached  to 
distinct  but  interconnected  computer  communication  networks.  Very  few 
assumptions  are  made  as  to  the  reliability  of  the  communication 
protocols  below  the  TCP  layer.  TCP  assumes  it  can  obtain  a  simple, 
potentially  unreliable  datagram  service  from  the  lower  level 
protocols.  In principle, the TCP should be able to operate above a
wide spectrum of communication systems ranging from hard-wired
connections to packet-switched or circuit-switched networks.

TCP is based on concepts first described by Cerf and Kahn in [1].  The
TCP  fits  into  a  layered  protocol  architecture  just  above  a  basic 
Internet  Protocol  [2]  which  provides  a  way  for  the  TCP  to  send  and 
receive  variable-length  segments  of  information  enclosed  in  internet 
datagram  "envelopes".  The  internet  datagram  provides  a  means  for 
addressing  source  and  destination  TCPs  in  different  networks.  The 
internet  protocol  also  deals  with  any  fragmentation  or  reassembly  of 
the  TCP  segments  required  to  achieve  transport  and  delivery  through 
multiple  networks  and  interconnecting  gateways.  The  internet  protocol 
also  carries  information  on  the  precedence,  security  classification 
and compartmentation of the TCP segments, so this information can be
communicated  end-to-end  across  multiple  networks. 

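Protocol Layering

+---------------------+
|     higher-level    |
+---------------------+
|        TCP          |
+---------------------+
|  internet protocol  |
+---------------------+
|communication network|
+---------------------+

Figure 1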

Much of this document is written in the context of TCP implementations
which are co-resident with higher level protocols in the host
computer.  Some computer systems will be connected to networks via
front-end computers which house the TCP and internet protocol layers,
as well as network specific software.  The TCP specification describes
an interface to the higher level protocols which appears to be
implementable even for the front-end case, as long as a suitable
host-to-front end protocol is implemented.

1.2.  Scope 

The  TCP  is  intended  to  provide  a  reliable  process-to-process 
communication  service  in  a  multinetwork  environment.  The  TCP  is 
intended  to  be  a  host-to-host  protocol  in  common  use  in  multiple 
networks.

1.3.  About this Document

This  document  represents  a  specification  of  the  behavior  required  of 
any  TCP  implementation,  both  in  its  interactions  with  higher  level 
protocols and in its interactions with other TCPs.  The rest of this
section offers a very brief view of the protocol interfaces and
operation.  Section 2 summarizes the philosophical basis for the TCP
design.  Section 3 offers both a detailed description of the actions
required of TCP when various events occur (arrival of new segments,
user calls, errors, etc.) and the details of the formats of TCP
segments.

1.4.  Interfaces

The  TCP  interfaces  on  one  side  to  user  or  application  processes  and  on 
the other side to a lower level protocol such as Internet Protocol.

The  interface  between  an  application  process  and  the  TCP  is 
illustrated in reasonable detail.  This interface consists of a set of
calls  much  like  the  calls  an  operating  system  provides  to  an 
application  process  for  manipulating  files.  For  example,  there  are 
calls  to  open  and  close  connections  and  to  send  and  receive  data  on 
established  connections.  It  is  also  expected  that  the  TCP  can 
asynchronously  communicate  with  application  programs.  Although 
considerable  freedom  is  permitted  to  TCP  implementors  to  design 
interfaces  which  are  appropriate  to  a  particular  operating  system 
environment,  a  minimum  functionality  is  required  at  the  TCP/user 
interface  for  any  valid  implementation. 

The  interface  between  TCP  and  lower  level  protocol  is  essentially 
unspecified  except  that  it  is  assumed  there  is  a  mechanism  whereby  the 
two levels can asynchronously pass information to each other.
Typically, one expects the lower level protocol to specify this
interface.  TCP is designed to work in a very general environment of
interconnected networks.  The lower level protocol which is assumed
throughout this document is the Internet Protocol [2].

1.5.  Operation 

As  noted  above,  the  primary  purpose  of  the  TCP  is  to  provide  reliable, 
securable  logical  circuit  or  connection  service  between  pairs  of 
processes.  To  provide  this  service  on  top  of  a  less  reliable  internet 
communication  system  requires  facilities  in  the  following  areas: 

Basic Data Transfer
Reliability
Flow Control
Multiplexing
Connections
Precedence and Security

The  basic  operation  of  the  TCP  in  each  of  these  areas  is  described  in 
the following paragraphs.

Basic Data Transfer:

The TCP is able to transfer a continuous stream of octets in each
direction between its users by packaging some number of octets into
segments for transmission through the internet system.  In general,
the TCPs decide when to block and forward data at their own
convenience.

Sometimes users need to be sure that all the data they have
submitted to the TCP has been transmitted.  For this purpose a push
function is defined.  To assure that data submitted to a TCP is
actually transmitted, the sending user indicates that it should be
pushed  through  to  the  receiving  user.  A  push  causes  the  TCPs  to 
promptly  forward  and  deliver  data  up  to  that  point  to  the  receiver. 
The  exact  push  point  might  not  be  visible  to  the  receiving  user  and 
the  push  function  does  not  supply  a  record  boundary  marker. 

Reliability: 

The  TCP  must  recover  from  data  that  is  damaged,  lost,  duplicated,  or 
delivered  out  of  order  by  the  internet  communication  system.  This 
is achieved by assigning a sequence number to each octet
transmitted,  and  requiring  a  positive  acknowledgment  (ACK)  from  the 
receiving  TCP.  If  the  ACK  is  not  received  within  a  timeout 
interval,  the  data  is  retransmitted.  At  the  receiver,  the  sequence 
numbers  are  used  to  correctly  order  segments  that  may  be  received 
out  of  order  and  to  eliminate  duplicates.  Damage  is  handled  by 
adding  a  checksum  to  each  segment  transmitted,  checking  it  at  the 
receiver, and discarding damaged segments.

As  long  as  the  TCPs  continue  to  function  properly  and  the  internet 
system  does  not  become  completely  partitioned,  no  transmission 
errors  will  affect  the  correct  delivery  of  data.  TCP  recovers  from 
internet  communication  system  errors. 
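
The reordering and duplicate elimination described above can be
illustrated with a minimal Python sketch.  It is not part of the
specification: the class name Receiver and its fields are hypothetical,
and the sketch assumes that sequence numbers do not wrap around and
that queued segments do not overlap one another.

```python
class Receiver:
    """Toy receiver: orders segments by sequence number and drops duplicates."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq   # sequence number of the next octet expected
        self.pending = {}             # out-of-order segments, keyed by sequence number
        self.delivered = bytearray()  # octets released to the user, in order

    def on_segment(self, seq, data):
        """Accept one segment and return the acknowledgment number to send back."""
        end = seq + len(data)
        if end <= self.next_seq:
            return self.next_seq               # every octet was already received
        if seq < self.next_seq:
            data = data[self.next_seq - seq:]  # trim the part already received
            seq = self.next_seq
        self.pending.setdefault(seq, data)
        # Release every segment that is now contiguous with the delivered data.
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            self.delivered += chunk
            self.next_seq += len(chunk)
        return self.next_seq
```

A segment that arrives early is held until the gap before it is filled,
and a retransmitted copy of data already delivered only repeats the
current acknowledgment number.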

Flow Control:

TCP  provides  a  means  for  the  receiver  to  govern  the  amount  of  data 
sent by the sender.  This is achieved by returning a "window" with
every  ACK  indicating  a  range  of  acceptable  sequence  numbers  beyond 
the  last  segment  successfully  received.  The  window  indicates  an 
allowed  number  of  octets  that  the  sender  may  transmit  before 
receiving  further  permission. 
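
As a rough illustration of how a sender might apply the window, the
Python sketch below computes how many more octets may be sent before
further permission is needed.  The names snd_una (oldest
unacknowledged sequence number) and snd_nxt (next sequence number to
be sent) are assumptions introduced for this example, and arithmetic
is taken modulo 2**32 because sequence numbers occupy 32 bits.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers are 32-bit quantities

def usable_window(snd_una, snd_nxt, window):
    """Octets the sender may still transmit before receiving further permission."""
    in_flight = (snd_nxt - snd_una) % SEQ_MODULUS  # sent but not yet acknowledged
    return max(0, window - in_flight)
```

With a window of 500 octets and 300 octets outstanding, the sender in
this sketch may transmit 200 more; once 500 or more are outstanding it
must wait for a new acknowledgment.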

Multiplexing:

To allow for many processes within a single host to use TCP
communication facilities simultaneously, the TCP provides a set of
addresses or ports within each host.  Concatenated with the network
and host addresses from the internet communication layer, this forms
a socket.  A pair of sockets uniquely identifies each connection.
That is, a socket may be simultaneously used in multiple
connections.
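
The idea that a pair of sockets, rather than a single socket, names a
connection can be sketched in Python as follows.  The Socket type, the
connections table, and the addresses and port numbers used with them
are purely illustrative assumptions.

```python
from typing import NamedTuple

class Socket(NamedTuple):
    address: str  # network and host address from the internet layer
    port: int     # port within that host

connections = {}  # (local socket, foreign socket) -> per-connection state

def open_connection(local, foreign):
    """Register a connection under its socket pair and return that pair."""
    key = (local, foreign)
    if key in connections:
        raise ValueError("this pair of sockets already names a connection")
    connections[key] = {"state": "OPEN"}
    return key
```

Calling open_connection twice with the same local socket but different
foreign sockets yields two distinct table entries, which is the sense
in which one socket can take part in several connections at once.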

The binding of ports to processes is handled independently by each
host.  However, it proves useful to attach frequently used processes
(e.g., a "logger" or timesharing service) to fixed sockets which are
made known to the public.  These services can then be accessed
through the known addresses.  Establishing and learning the port
addresses of other processes may involve more dynamic mechanisms.

Connections:

The  reliability  and  flow  control  mechanisms  described  above  require 
that  TCPs  initialize  and  maintain  certain  status  information  for 
each  data  stream.  The  combination  of  this  information,  including 
sockets,  sequence  numbers,  and  window  sizes,  is  called  a  connection. 
Each  connection  is  uniquely  specified  by  a  pair  of  sockets 
identifying  its  two  sides. 

When two processes wish to communicate, their TCPs must first
establish  a  connection  (initialize  the  status  information  on  each 
side).  When  their  communication  is  complete,  the  connection  is 
terminated  or  closed  to  free  the  resources  for  other  uses. 

Since  connections  must  be  established  between  unreliable  hosts  and 
over  the  unreliable  internet  communication  system,  a  handshake 
mechanism  with  clock-based  sequence  numbers  is  used  to  avoid 
erroneous  initialization  of  connections. 

Precedence  and  Security: 

The  users  of  TCP  may  indicate  the  security  and  precedence  of  their 
communication.  Provision  is  made  for  default  values  to  be  used  when 
these features are not needed.

2.  PHILOSOPHY

2.1.  Elements  of  the  Internetwork  System 

The internetwork environment consists of hosts connected to networks
which are in turn interconnected via gateways.  It is assumed here
that the networks may be either local networks (e.g., the ETHERNET) or
large networks (e.g., the ARPANET), but in any case are based on
packet switching technology.  The active agents that produce and
consume messages are processes.  Various levels of protocols in the
networks, the gateways, and the hosts support an interprocess
communication system that provides two-way data flow on logical
connections between process ports.

The term packet is used generically here to mean the data of one
transaction between a host and its network.  The format of data blocks
exchanged within a network will generally not be of concern to us.

Hosts  are  computers  attached  to  a  network,  and  from  the  communication 
network's  point  of  view,  are  the  sources  and  destinations  of  packets. 
Processes  are  viewed  as  the  active  elements  in  host  computers  (in 
accordance  with  the  fairly  common  definition  of  a  process  as  a  program 
in  execution).  Even  terminals  and  files  or  other  I/O  devices  are 
viewed  as  communicating  with  each  other  through  the  use  of  processes. 
Thus,  all  communication  is  viewed  as  inter-process  communication. 

Since a process may need to distinguish among several communication
streams between itself and another process (or processes), we imagine
that  each  process  may  have  a  number  of  ports  through  which  it 
communicates  with  the  ports  of  other  processes. 

2.2.  Model of Operation

Processes transmit data by calling on the TCP and passing buffers of
data as arguments.  The TCP packages the data from these buffers into
segments and calls on the internet module to transmit each segment to
the destination TCP.  The receiving TCP places the data from a segment
into the receiving user's buffer and notifies the receiving user.  The
TCPs include control information in the segments which they use to
ensure reliable ordered data transmission.

The model of internet communication is that there is an internet
protocol  module  associated  with  each  TCP  which  provides  an  interface 
to  the  local  network.  This  internet  module  packages  TCP  segments 
inside  internet  datagrams  and  routes  these  datagrams  to  a  destination 
internet  module  or  intermediate  gateway.  To  transmit  the  datagram 
through  the  local  network,  it  is  embedded  in  a  local  network  packet. 

The packet switches may perform further packaging, fragmentation, or
other operations to achieve the delivery of the local packet to the
destination internet module.

At a gateway between networks, the internet datagram is "unwrapped"
from  its  local  packet  and  examined  to  determine  through  which  network 
the  internet  datagram  should  travel  next.  The  internet  datagram  is 
then  "wrapped"  in  a  local  packet  suitable  to  the  next  network  and 
routed  to  the  next  gateway,  or  to  the  final  destination. 

A  gateway  is  permitted  to  break  up  an  internet  datagram  into  smaller 
internet  datagram  fragments  if  this  is  necessary  for  transmission 
through  the  next  network.  To  do  this,  the  gateway  produces  a  set  of 
internet datagrams, each carrying a fragment.  Fragments may be
further  broken  into  smaller  fragments  at  subsequent  gateways.  The 
internet  datagram  fragment  format  is  designed  so  that  the  destination 
internet  module  can  reassemble  fragments  into  internet  datagrams. 

A  destination  internet  module  unwraps  the  segment  from  the  datagram 
(after  reassembling  the  datagram,  if  necessary)  and  passes  it  to  the 
destination  TCP. 

This  simple  model  of  the  operation  glosses  over  many  details.  One 
important  feature  is  the  type  of  service.  This  provides  information 
to  the  gateway  (or  internet  module)  to  guide  it  in  selecting  the 
service  parameters  to  be  used  in  traversing  the  next  network. 

Included  in  the  type  of  service  information  is  the  precedence  of  the 
datagram.  Datagrams  may  also  carry  security  information  to  permit 
hosts and gateways that operate in multilevel secure environments to
properly  segregate  datagrams  for  security  considerations. 

2.3.  The  Host  Environment 

The TCP is assumed to be a module in an operating system.  The users
access  the  TCP  much  like  they  would  access  the  file  system.  The  TCP 
may  call  on  other  operating  system  functions,  for  example,  to  manage 
data  structures.  The  actual  interface  to  the  network  is  assumed  to  be 
controlled  by  a  device  driver  module.  The  TCP  does  not  call  on  the 
network  device  driver  directly,  but  rather  calls  on  the  internet 
datagram  protocol  module  which  may  in  turn  call  on  the  device  driver. 

The mechanisms of TCP do not preclude implementation of the TCP in a
front-end processor.  However, in such an implementation, a
host-to-front-end protocol must provide the functionality to support
the type of TCP-user interface described in this document.

2.4.  Interfaces

The TCP/user interface provides for calls made by the user on the TCP
to  OPEN  or  CLOSE  a  connection,  to  SEND  or  RECEIVE  data,  or  to  obtain 
STATUS  about  a  connection.  These  calls  are  like  other  calls  from  user 
programs  on  the  operating  system,  for  example,  the  calls  to  open,  read 
from,  and  close  a  file. 


The  TCP/internet  interface  provides  calls  to  send  and  receive 
datagrams  addressed  to  TCP  modules  in  hosts  anywhere  in  the  internet 
system.  These  calls  have  parameters  for  passing  the  address,  type  of 
service,  precedence,  security,  and  other  control  information. 


2.5.  Relation  to  Other  Protocols 

The  following  diagram  illustrates  the  place  of  the  TCP  in  the  protocol 
hierarchy: 


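+------+ +-----+ +-----+
|Telnet| | FTP | |Voice|       Application Level
+------+ +-----+ +-----+
   |        |       |
 +------------+  +-----+
 |    TCP     |  | RTP |       Host Level
 +------------+  +-----+
        |           |
+---------------------------+
| Internet Protocol & ICMP  |  Gateway Level
+---------------------------+
              |
+---------------------------+
|  Local Network Protocol   |  Network Level
+---------------------------+

Protocol Relationships

Figure 2.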

It  is  expected  that  the  TCP  will  be  able  to  support  higher  level 
protocols  efficiently.  It  should  be  easy  to  interface  higher  level 
protocols  like  the  ARPANET  Telnet  or  AUTODIN  II  THP  to  the  TCP. 

2.6.  Reliable  Communication 

A  stream  of  data  sent  on  a  TCP  connection  is  delivered  reliably  and  in 
order at the destination.

Transmission is made reliable via the use of sequence numbers and
acknowledgments.  Conceptually,  each  octet  of  data  is  assigned  a 
sequence  number.  The  sequence  number  of  the  first  octet  of  data  in  a 
segment  is  transmitted  with  that  segment  and  is  called  the  segment 
sequence  number.  Segments  also  carry  an  acknowledgment  number  which 
is  the  sequence  number  of  the  next  expected  data  octet  of 
transmissions  in  the  reverse  direction.  When  the  TCP  transmits  a 
segment  containing  data,  it  puts  a  copy  on  a  retransmission  queue  and 
starts  a  timer;  when  the  acknowledgment  for  that  data  is  received,  the 
segment  is  deleted  from  the  queue.  If  the  acknowledgment  is  not 
received before the timer runs out, the segment is retransmitted.
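
The retransmission queue just described might be sketched in Python as
shown below.  This is an illustration under stated assumptions, not a
prescribed implementation: the class and method names are
hypothetical, a single fixed timeout is used, the caller supplies the
current time, and sequence number wraparound is ignored.

```python
class RetransmissionQueue:
    """Holds copies of sent segments until they are acknowledged."""

    def __init__(self, timeout):
        self.timeout = timeout  # fixed retransmission timeout (an assumption)
        self.queue = []         # entries are [seq, data, deadline]

    def on_send(self, seq, data, now):
        """Keep a copy of a transmitted segment and start its timer."""
        self.queue.append([seq, data, now + self.timeout])

    def on_ack(self, ack):
        """Delete every segment whose octets all lie below the acknowledgment number."""
        self.queue = [e for e in self.queue if e[0] + len(e[1]) > ack]

    def due(self, now):
        """Return (seq, data) for each expired segment and restart its timer."""
        expired = []
        for entry in self.queue:
            if entry[2] <= now:
                entry[2] = now + self.timeout
                expired.append((entry[0], entry[1]))
        return expired
```

A caller would invoke on_send for each data segment it transmits,
on_ack whenever an acknowledgment number arrives, and due periodically
to learn which segments to send again.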

An  acknowledgment  by  TCP  does  not  guarantee  that  the  data  has  been 
delivered  to  the  end  user  but  only  that  the  receiving  TCP  has  taken 
the  responsibility  to  do  so. 

To govern the flow of data between TCPs, a flow control mechanism is
employed.  The receiving TCP reports a "window" to the sending TCP.
This window specifies the number of octets, starting with the
acknowledgment number, that the receiving TCP is currently prepared to
receive.

2.7.  Connection  Establishment  and  Clearing 

To identify the separate data streams that a TCP may handle, the TCP
provides a port identifier.  Since port identifiers are selected
independently  by  each  TCP  they  might  not  be  unique.  To  provide  for 
unique  addresses  within  each  TCP,  we  concatenate  an  internet  address 
identifying  the  TCP  with  a  port  identifier  to  create  a  socket  which 
will  be  unique  throughout  all  networks  connected  together. 

A  connection  is  fully  specified  by  the  pair  of  sockets  at  the  ends.  A 
local  socket  may  participate  in  many  connections  to  different  foreign 
sockets.  A  connection  can  be  used  to  carry  data  in  both  directions, 
that is, it is "full duplex".

TCPs  are  free  to  associate  ports  with  processes  however  they  choose. 
However,  several  basic  concepts  are  necessary  in  any  implementation. 
There  must  be  well-known  sockets  which  the  TCP  associates  only  with 
the  "appropriate"  processes  by  some  means.  We  envision  that  processes 
may  "own"  ports,  and  that  processes  can  initiate  connections  only  on 
the  ports  they  own.  (Means  for  implementing  ownership  is  a  local 
issue,  but  we  envision  a  Request  Port  user  command,  or  a  method  of 
uniquely allocating a group of ports to a given process, e.g., by
associating  the  high  order  bits  of  a  port  name  with  a  given  process.) 

A  connection  is  specified  in  the  OPEN  call  by  the  local  port  and 
foreign  socket  arguments.  In  return,  the  TCP  supplies  a  (short)  local 
connection name by which the user refers to the connection in
subsequent  calls.  There  are  several  things  that  must  be  remembered 
about  a  connection.  To  store  this  information  we  imagine  that  there 
is  a  data  structure  called  a  Transmission  Control  Block  (TCB).  One 
implementation  strategy  would  have  the  local  connection  name  be  a 
pointer  to  the  TCB  for  this  connection.  The  OPEN  call  also  specifies 
whether the connection establishment is to be actively pursued, or to
be  passively  waited  for. 

A  passive  OPEN  request  means  that  the  process  wants  to  accept  incoming 
connection  requests  rather  than  attempting  to  initiate  a  connection. 
Often  the  process  requesting  a  passive  OPEN  will  accept  a  connection 
request  from  any  caller.  In  this  case  a  foreign  socket  of  all  zeros 
is  used  to  denote  an  unspecified  socket.  Unspecified  foreign  sockets 
are allowed only on passive OPENs.

A  service  process  that  wished  to  provide  services  for  unknown  other 
processes  would  issue  a  passive  OPEN  request  with  an  unspecified 
foreign  socket.  Then  a  connection  could  be  made  with  any  process  that 
requested  a  connection  to  this  local  socket.  It  would  help  if  this 
local  socket  were  known  to  be  associated  with  this  service. 

Well-known  sockets  are  a  convenient  mechanism  for  a  priori  associating 
a  socket  address  with  a  standard  service.  For  instance,  the 
"Telnet-Server"  process  is  permanently  assigned  to  a  particular 
socket,  and  other  sockets  are  reserved  for  File  Transfer,  Remote  Job 
Entry, Text Generator, Echoer, and Sink processes (the last three
being for test purposes).  A socket address might be reserved for
access  to  a  "Look-Up"  service  which  would  return  the  specific  socket 
at  which  a  newly  created  service  would  be  provided.  The  concept  of  a 
well-known  socket  is  part  of  the  TCP  specification,  but  the  assignment 
of  sockets  to  services  is  outside  this  specification.  (See  [4].) 

Processes can issue passive OPENs and wait for matching active OPENs
from other processes and be informed by the TCP when connections have
been established.  Two processes which issue active OPENs to each
other at the same time will be correctly connected.  This flexibility
is  critical  for  the  support  of  distributed  computing  in  which 
components  act  asynchronously  with  respect  to  each  other. 

There are two principal cases for matching the sockets in the local
passive OPENs and the foreign active OPENs.  In the first case, the
local passive OPEN has fully specified the foreign socket.  In this
case, the match must be exact.  In the second case, the local passive
OPEN has left the foreign socket unspecified.  In this case, any
foreign socket is acceptable as long as the local sockets match.

Other possibilities include partially restricted matches.

If there are several pending passive OPENs (recorded in TCBs) with the
same local socket, a foreign active OPEN will be matched to a TCB
with the specific foreign socket in the foreign active OPEN, if such a
TCB exists, before selecting a TCB with an unspecified foreign socket.
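
The matching rule above, exact foreign socket first and unspecified
foreign socket second, can be sketched in Python.  Representing a TCB
as a dictionary and the unspecified (all zeros) foreign socket as None
are assumptions made only for this illustration.

```python
UNSPECIFIED = None  # stands in for the all-zeros foreign socket

def match_tcb(tcbs, local, foreign):
    """Choose the TCB for an arriving active OPEN, preferring an exact match."""
    wildcard = None
    for tcb in tcbs:
        if tcb["local"] != local:
            continue                 # local sockets must always match
        if tcb["foreign"] == foreign:
            return tcb               # fully specified match wins immediately
        if tcb["foreign"] is UNSPECIFIED and wildcard is None:
            wildcard = tcb           # remember the first unspecified candidate
    return wildcard                  # None when no pending passive OPEN matches
```

When a fully specified TCB and an unspecified TCB share the same local
socket, a caller whose socket equals the specified one is given the
former and every other caller is given the latter.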

The procedures to establish connections utilize the synchronize (SYN)
control flag and involve an exchange of three messages.  This
exchange has been termed a three-way handshake [3].

A  connection  is  initiated  by  the  rendezvous  of  an  arriving  segment 
containing  a  SYN  and  a  waiting  TCB  entry  each  created  by  a  user  OPEN 
command.  The  matching  of  local  and  foreign  sockets  determines  when  a 
connection has been initiated.  The connection becomes "established"
when  sequence  numbers  have  been  synchronized  in  both  directions. 
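
For orientation, the three messages of the simplest exchange (one side
performing an active OPEN toward a passive peer) can be listed in a
short Python sketch.  The dictionary layout and function name are
hypothetical.  The rules it assumes are that a SYN occupies one
sequence number, so each side acknowledges the other's initial
sequence number plus one, and that the second message carries both
SYN and ACK; the second rule reflects TCP as it is commonly described
rather than anything stated in this section.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers are 32-bit quantities

def three_way_handshake(isn_a, isn_b):
    """Return the three segments exchanged when A actively opens to passive B."""
    syn = {"from": "A", "flags": {"SYN"}, "seq": isn_a}
    syn_ack = {"from": "B", "flags": {"SYN", "ACK"}, "seq": isn_b,
               "ack": (isn_a + 1) % SEQ_MODULUS}
    ack = {"from": "A", "flags": {"ACK"}, "seq": (isn_a + 1) % SEQ_MODULUS,
           "ack": (isn_b + 1) % SEQ_MODULUS}
    return [syn, syn_ack, ack]
```

After the third segment both sides know the other's initial sequence
number, which is the synchronization "in both directions" referred to
above.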

The  clearing  of  a  connection  also  involves  the  exchange  of  segments, 
in  this  case  carrying  the  FIN  control  flag. 

2.8.  Data  Communication 

The  data  that  flows  on  a  connection  may  be  thought  of  as  a  stream  of 
octets.  The  sending  user  indicates  in  each  SEND  call  whether  the  data 
in that call (and any preceding calls) should be immediately pushed
through  to  the  receiving  user  by  the  setting  of  the  PUSH  flag. 

A  sending  TCP  is  allowed  to  collect  data  from  the  sending  user  and  to 
send  that  data  in  segments  at  its  own  convenience,  until  the  push 
function  is  signaled,  then  it  must  send  all  unsent  data.  When  a 
receiving  TCP  sees  the  PUSH  flag,  it  must  not  wait  for  more  data  from 
the  sending  TCP  before  passing  the  data  to  the  receiving  process. 

There  is  no  necessary  relationship  between  push  functions  and  segment 
boundaries.  The  data  in  any  particular  segment  may  be  the  result  of  a 
single  SEND  call,  in  whole  or  part,  or  of  multiple  SEND  calls. 

The purpose of the push function and the PUSH flag is to push data through
from the sending user to the receiving user.  It does not provide a
record service.

There is a coupling between the push function and the use of buffers
of data that cross the TCP/user interface.  Each time a PUSH flag is
associated with data placed into the receiving user's buffer, the
buffer is returned to the user for processing even if the buffer is
not filled.  If data arrives that fills the user's buffer before a
PUSH is seen, the data is passed to the user in buffer size units.
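
The interaction between PUSH and receive buffers can be illustrated
with a small Python sketch.  It models the incoming stream as a list
of (data, push) pairs and a single fixed buffer size; both the
function name and this representation are assumptions made for the
example.

```python
def deliver(chunks, buffer_size):
    """Return the buffers handed to the user for a stream of (data, push) pairs."""
    returned = []
    buf = bytearray()
    for data, push in chunks:
        for octet in data:
            buf.append(octet)
            if len(buf) == buffer_size:  # buffer filled before any PUSH was seen
                returned.append(bytes(buf))
                buf.clear()
        if push and buf:                 # PUSH returns a partly filled buffer
            returned.append(bytes(buf))
            buf.clear()
    return returned                      # octets still in buf await more data or a PUSH
```

With a buffer size of 4, the stream (b"abc", no push), (b"de", push),
(b"fghijkl", no push) yields the buffers b"abcd", b"e", and b"fghi",
while the trailing b"jkl" is held back until the buffer fills or a
PUSH arrives.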

TCP  also  provides  a  means  to  communicate  to  the  receiver  of  data  that 
at  some  point  further  along  in  the  data  stream  than  the  receiver  is 
currently reading there is urgent data.  TCP does not attempt to
define what the user specifically does upon being notified of pending
urgent data, but the general notion is that the receiving process will
take action to process the urgent data quickly.

2.9.  Precedence  and  Security 

The  TCP  makes  use  of  the  internet  protocol  type  of  service  field  and 
security  option  to  provide  precedence  and  security  on  a  per  connection 
basis to TCP users.  Not all TCP modules will necessarily function in
a  multilevel  secure  environment;  some  may  be  limited  to  unclassified 
use  only  and  others  may  operate  at  only  one  security  level  and 
compartment.  Consequently,  some  TCP  implementations  and  services  to 
users  may  be  limited  to  a  subset  of  the  multilevel  secure  case. 

TCP  modules  which  operate  in  a  multilevel  secure  environment  must 
properly  mark  outgoing  segments  with  the  security,  compartment,  and 
precedence.  Such  TCP  modules  must  also  provide  to  their  users  or 
higher level protocols such as Telnet or THP an interface to allow
them  to  specify  the  desired  security  level,  compartment,  and 
precedence  of  connections. 

2.10.  Robustness Principle

TCP  implementations  will  follow  a  general  principle  of  robustness:  be 
conservative  in  what  you  do,  be  liberal  in  what  you  accept  from 
others.

3.  FUNCTIONAL SPECIFICATION

3.1.  Header Format

TCP  segments  are  sent  as  internet  datagrams.  The  Internet  Protocol 
header  carries  several  information  fields,  including  the  source  and 
destination  host  addresses  [2].  A  TCP  header  follows  the  internet 
header,  supplying  information  specific  to  the  TCP  protocol.  This 
division  allows  for  the  existence  of  host  level  protocols  other  than 
TCP. 

TCP  Header  Format 


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

TCP  Header  Format 

Note  that  one  tick  mark  represents  one  bit  position. 

Figure  3. 

Source  Port:  16  bits 
The  source  port  number. 

Destination Port: 16 bits

The destination port number.

Sequence Number: 32 bits

The sequence number of the first data octet in this segment (except
when SYN is present). If SYN is present the sequence number is the
initial sequence number (ISN) and the first data octet is ISN+1.

Acknowledgment Number: 32 bits

If the ACK control bit is set this field contains the value of the
next sequence number the sender of the segment is expecting to
receive. Once a connection is established this is always sent.

Data  Offset:  4  bits 

The  number  of  32  bit  words  in  the  TCP  Header.  This  indicates  where 
the  data  begins.  The  TCP  header  (even  one  including  options)  is  an 
integral  number  of  32  bits  long. 

Reserved:  6  bits 


Reserved  for  future  use.  Must  be  zero. 

Control  Bits:  6  bits  (from  left  to  right): 

URG:  Urgent  Pointer  field  significant 

ACK:  Acknowledgment  field  significant 

PSH: Push Function

RST:  Reset  the  connection 

SYN:  Synchronize  sequence  numbers 

FIN:  No  more  data  from  sender 

Window:  16  bits 

The  number  of  data  octets  beginning  with  the  one  indicated  in  the 
acknowledgment  field  which  the  sender  of  this  segment  is  willing  to 
accept.

Checksum:  16  bits 


The  checksum  field  is  the  16  bit  one's  complement  of  the  one's 
complement  sum  of  all  16  bit  words  in  the  header  and  text.  If  a 
segment  contains  an  odd  number  of  header  and  text  octets  to  be 
checksummed,  the  last  octet  is  padded  on  the  right  with  zeros  to 
form  a  16  bit  word  for  checksum  purposes.  The  pad  is  not 
transmitted  as  part  of  the  segment.  While  computing  the  checksum, 
the  checksum  field  itself  is  replaced  with  zeros. 

The checksum also covers a 96 bit pseudo header conceptually
prefixed to the TCP header. This pseudo header contains the Source
Address, the Destination Address, the Protocol, and TCP length.
This gives the TCP protection against misrouted segments. This
information is carried in the Internet Protocol and is transferred
across the TCP/Network interface in the arguments or results of
calls by the TCP on the IP.


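   +--------+--------+--------+--------+
   |           Source Address          |
   +--------+--------+--------+--------+
   |         Destination Address       |
   +--------+--------+--------+--------+
   |  zero  |  PTCL  |    TCP Length   |
   +--------+--------+--------+--------+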

The  TCP  Length  is  the  TCP  header  length  plus  the  data  length  in 
octets  (this  is  not  an  explicitly  transmitted  quantity,  but  is 
computed),  and  it  does  not  count  the  12  octets  of  the  pseudo 
header. 
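
The checksum rule above can be sketched in Python as follows. This is
an illustration only, not part of the specification: the function
names are hypothetical, the addresses are assumed to be 4-octet
internet addresses, and the default protocol value 6 is the number
assigned to TCP by the Internet Protocol, not something defined in
this section.

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """One's complement sum of the 16 bit big-endian words in `data`."""
    if len(data) % 2:
        data += b"\x00"              # pad on the right; the pad is never transmitted
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total

def tcp_checksum(src: bytes, dst: bytes, segment: bytes, ptcl: int = 6) -> int:
    """Checksum over the 96 bit pseudo header plus TCP header and text.

    `segment` is the TCP header and data with the checksum field
    (octets 16 and 17) already replaced with zeros.
    """
    # Pseudo header: source address, destination address, zero, PTCL, TCP length.
    pseudo = src + dst + struct.pack("!BBH", 0, ptcl, len(segment))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF
```

A receiver can verify a segment by summing the pseudo header and the
segment as received (checksum field included); with this sketch an
undamaged segment sums to 0xFFFF.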


Urgent  Pointer:  16  bits 

This field communicates the current value of the urgent pointer as a
positive offset from the sequence number in this segment. The
urgent pointer points to the sequence number of the octet following
the urgent data. This field is only interpreted in segments with
the URG control bit set.
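
As an illustration of the fixed part of the layout in figure 3, the
first 20 octets of a segment could be decoded as sketched below. The
names `TcpHeader` and `parse_fixed_header` are hypothetical, and the
sketch does not validate the Reserved bits or read any options.

```python
import struct
from typing import NamedTuple

# Control bit masks, from left to right in the 6 bit Control Bits field.
URG, ACK, PSH, RST, SYN, FIN = 0x20, 0x10, 0x08, 0x04, 0x02, 0x01

class TcpHeader(NamedTuple):
    source_port: int
    destination_port: int
    sequence_number: int
    acknowledgment_number: int
    data_offset: int        # header length in 32 bit words
    control_bits: int       # combination of the masks above
    window: int
    checksum: int
    urgent_pointer: int

def parse_fixed_header(segment: bytes) -> TcpHeader:
    """Decode the 20 octets that precede any options."""
    (sport, dport, seq, ack,
     offset_and_bits, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return TcpHeader(sport, dport, seq, ack,
                     offset_and_bits >> 12,      # top 4 bits: Data Offset
                     offset_and_bits & 0x3F,     # low 6 bits: Control Bits
                     window, checksum, urgent)
```

The data begins at octet `4 * data_offset` of the segment, since the
Data Offset counts 32 bit words.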


Options:  variable 

Options may occupy space at the end of the TCP header and are a
multiple of 8 bits in length. All options are included in the
checksum. An option may begin on any octet boundary. There are two
cases for the format of an option:

Case 1: A single octet of option-kind.

Case 2: An octet of option-kind, an octet of option-length, and
the actual option-data octets.

The option-length counts the two octets of option-kind and
option-length as well as the option-data octets.

Note that the list of options may be shorter than the data offset
field might imply. The content of the header beyond the
End-of-Option option must be header padding (i.e., zero).

A TCP must implement all options.

Currently defined options include (kind indicated in octal):

Kind     Length    Meaning
----     ------    -------
 0         -       End of option list.
 1         -       No-Operation.
 2         4       Maximum Segment Size.


Specific Option Definitions

End of Option List

   +--------+
   |00000000|
   +--------+
    Kind=0

This  option  code  indicates  the  end  of  the  option  list.  This 
might  not  coincide  with  the  end  of  the  TCP  header  according  to 
the  Data  Offset  field.  This  is  used  at  the  end  of  all  options, 
not  the  end  of  each  option,  and  need  only  be  used  if  the  end  of 
the  options  would  not  otherwise  coincide  with  the  end  of  the  TCP 
header. 

No-Operation

   +--------+
   |00000001|
   +--------+
    Kind=1

This option code may be used between options, for example, to
align the beginning of a subsequent option on a word boundary.
There is no guarantee that senders will use this option, so
receivers must be prepared to process options even if they do
not begin on a word boundary.

Maximum Segment Size

   +--------+--------+---------+--------+
   |00000010|00000100|   max seg size   |
   +--------+--------+---------+--------+
    Kind=2   Length=4

Maximum Segment Size Option Data: 16 bits

If this option is present, then it communicates the maximum
receive segment size at the TCP which sends this segment.

This field must only be sent in the initial connection request
(i.e., in segments with the SYN control bit set). If this
option is not used, any segment size is allowed.

Padding:  variable 

The  TCP  header  padding  is  used  to  ensure  that  the  TCP  header  ends 
and  data  begins  on  a  32  bit  boundary.  The  padding  is  composed  of 
zeros. 
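
The two option formats can be walked with a loop like the one
sketched below. The function name `parse_options` and the choice to
raise `ValueError` on a malformed length are assumptions made for
illustration; the text above does not say how a receiver should react
to a malformed option.

```python
from typing import List, Tuple

END_OF_OPTION_LIST = 0
NO_OPERATION = 1
MAXIMUM_SEGMENT_SIZE = 2

def parse_options(options: bytes) -> List[Tuple[int, bytes]]:
    """Return (kind, option-data) pairs from the option octets of a header."""
    parsed = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == END_OF_OPTION_LIST:
            break                        # whatever follows is header padding
        if kind == NO_OPERATION:
            parsed.append((kind, b""))   # case 1: a single octet of option-kind
            i += 1
            continue
        # Case 2: option-kind, option-length, then the option-data octets.
        if i + 1 >= len(options):
            raise ValueError("option-length octet missing")
        length = options[i + 1]          # counts the kind and length octets too
        if length < 2 or i + length > len(options):
            raise ValueError("bad option-length")
        parsed.append((kind, options[i + 2:i + length]))
        i += length
    return parsed
```

For example, the octets 01 02 04 05 B4 00 00 00 hold a No-Operation,
a Maximum Segment Size option whose 16 bit data field is 0x05B4, an
End of Option List, and two octets of padding.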

3.2.  Terminology

Before  we  can  discuss  very  much  about  the  operation  of  the  TCP  we  need 
to  introduce  some  detailed  terminology.  The  maintenance  of  a  TCP 
connection  requires  the  remembering  of  several  variables.  We  conceive 
of  these  variables  being  stored  in  a  connection  record  called  a 
Transmission  Control  Block  or  TCB.  Among  the  variables  stored  in  the 
TCB  are  the  local  and  remote  socket  numbers,  the  security  and 
precedence  of  the  connection,  pointers  to  the  user's  send  and  receive 
buffers,  pointers  to  the  retransmit  queue  and  to  the  current  segment. 
In  addition  several  variables  relating  to  the  send  and  receive 
sequence  numbers  are  stored  in  the  TCB. 
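
One way to picture the connection record is sketched below. The field
names follow the send and receive sequence variables named in this
section; the class name, the representation of a socket as an
(address, port) tuple, and the zero defaults are assumptions, and the
security, precedence, buffer, and queue pointers are left out for
brevity.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class TransmissionControlBlock:
    local_socket: Tuple[str, int]    # (address, port); representation assumed
    remote_socket: Tuple[str, int]
    # Send sequence variables
    snd_una: int = 0    # SND.UNA - send unacknowledged
    snd_nxt: int = 0    # SND.NXT - send next
    snd_wnd: int = 0    # SND.WND - send window
    snd_up: int = 0     # SND.UP  - send urgent pointer
    snd_wl1: int = 0    # SND.WL1 - sequence number used for last window update
    snd_wl2: int = 0    # SND.WL2 - acknowledgment number used for last window update
    iss: int = 0        # ISS     - initial send sequence number
    # Receive sequence variables
    rcv_nxt: int = 0    # RCV.NXT - receive next
    rcv_wnd: int = 0    # RCV.WND - receive window
    rcv_up: int = 0     # RCV.UP  - receive urgent pointer
    irs: int = 0        # IRS     - initial receive sequence number
```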

Send Sequence Variables

SND.UNA - send unacknowledged
SND.NXT - send next
SND.WND - send window
SND.UP  - send urgent pointer
SND.WL1 - segment sequence number used for last window update
SND.WL2 - segment acknowledgment number used for last window
          update
ISS     - initial send sequence number

Receive Sequence Variables

RCV.NXT - receive next
RCV.WND - receive window
RCV.UP  - receive urgent pointer
IRS     - initial receive sequence number

The following diagrams may help to relate some of these variables to
the sequence space.

Send Sequence Space

               1         2          3          4
          ----------|----------|----------|----------
                 SND.UNA    SND.NXT    SND.UNA
                                      +SND.WND

1 - old sequence numbers which have been acknowledged
2 - sequence numbers of unacknowledged data
3 - sequence numbers allowed for new data transmission
4 - future sequence numbers which are not yet allowed

Send Sequence Space
Figure 4.

The send window is the portion of the sequence space labeled 3 in
figure 4.
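
The four regions of figure 4 can be told apart with modular
arithmetic, as in the sketch below. The function name is hypothetical,
and the rule used to separate region 1 from region 4 (numbers less
than half the sequence space ahead of SND.UNA count as "future") is a
common convention assumed here, not something this section states.

```python
MOD = 1 << 32

def send_region(seq: int, snd_una: int, snd_nxt: int, snd_wnd: int) -> int:
    """Return the figure 4 region (1 to 4) in which `seq` falls."""
    offset = (seq - snd_una) % MOD           # how far seq lies ahead of SND.UNA
    in_flight = (snd_nxt - snd_una) % MOD    # size of region 2
    if offset < in_flight:
        return 2                             # sent but not yet acknowledged
    if offset < snd_wnd:
        return 3                             # allowed for new data transmission
    # Outside the window: assumed split between "future" and "old".
    return 4 if offset < MOD // 2 else 1
```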

Receive Sequence Space

                   1          2          3
               ----------|----------|----------
                      RCV.NXT    RCV.NXT
                                +RCV.WND

1 - old sequence numbers which have been acknowledged
2 - sequence numbers allowed for new reception
3 - future sequence numbers which are not yet allowed

Receive Sequence Space
Figure 5.


The  receive  window  is  the  portion  of  the  sequence  space  labeled  2  in 
figure  5. 

There are also some variables used frequently in the discussion that
take their values from the fields of the current segment.

Current Segment Variables

SEG.SEQ  -  segment  sequence  number 

SEG.ACK  -  segment  acknowledgment  number 

SEG.LEN  -  segment  length 

SEG.WND  -  segment  window 

SEG.UP  -  segment  urgent  pointer 

SEG.PRC  -  segment  precedence  value 

A connection progresses through a series of states during its
lifetime. The states are: LISTEN, SYN-SENT, SYN-RECEIVED,
ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK,
TIME-WAIT, and the fictional state CLOSED. CLOSED is fictional
because it represents the state when there is no TCB, and therefore,
no connection. Briefly the meanings of the states are:

LISTEN - represents waiting for a connection request from any remote
TCP and port.

SYN-SENT - represents waiting for a matching connection request
after having sent a connection request.

SYN-RECEIVED - represents waiting for a confirming connection
request acknowledgment after having both received and sent a
connection request.

ESTABLISHED - represents an open connection, data received can be
delivered to the user. The normal state for the data transfer phase
of the connection.

FIN-WAIT-1 - represents waiting for a connection termination request
from the remote TCP, or an acknowledgment of the connection
termination request previously sent.

FIN-WAIT-2 - represents waiting for a connection termination request
from the remote TCP.

CLOSE-WAIT - represents waiting for a connection termination request
from the local user.

CLOSING - represents waiting for a connection termination request
acknowledgment from the remote TCP.

LAST-ACK - represents waiting for an acknowledgment of the
connection termination request previously sent to the remote TCP
(which includes an acknowledgment of its connection termination
request).




September  1981 

Transmission  Control  Protocol 
Functional  Specification 


TIME-WAIT - represents waiting for enough time to pass to be sure
the  remote  TCP  received  the  acknowledgment  of  its  connection 
termination  request. 

CLOSED  -  represents  no  connection  state  at  all. 
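
An implementation might name these states with an enumeration such as
the one sketched below. The class name `State` is an assumption; the
member values are the state names used in this section.

```python
from enum import Enum

class State(Enum):
    LISTEN = "LISTEN"
    SYN_SENT = "SYN-SENT"
    SYN_RECEIVED = "SYN-RECEIVED"
    ESTABLISHED = "ESTABLISHED"
    FIN_WAIT_1 = "FIN-WAIT-1"
    FIN_WAIT_2 = "FIN-WAIT-2"
    CLOSE_WAIT = "CLOSE-WAIT"
    CLOSING = "CLOSING"
    LAST_ACK = "LAST-ACK"
    TIME_WAIT = "TIME-WAIT"
    CLOSED = "CLOSED"       # fictional: there is no TCB in this state
```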

A TCP connection progresses from one state to another in response to
events. The events are the user calls, OPEN, SEND, RECEIVE, CLOSE,
ABORT, and STATUS; the incoming segments, particularly those
containing the SYN, ACK, RST and FIN flags; and timeouts.

The state diagram in figure 6 illustrates only state changes, together
with the causing events and resulting actions, but addresses neither
error conditions nor actions which are not connected with state
changes. In a later section, more detail is offered with respect to
the reaction of the TCP to events.

NOTA BENE: this diagram is only a summary and must not be taken as
the total specification.

TCP Connection State Diagram
Figure 6.

3.3.  Sequence Numbers

A fundamental notion in the design is that every octet of data sent
over  a  TCP  connection  has  a  sequence  number.  Since  every  octet  is 
sequenced,  each  of  them  can  be  acknowledged.  The  acknowledgment 
mechanism  employed  is  cumulative  so  that  an  acknowledgment  of  sequence 
number  X  indicates  that  all  octets  up  to  but  not  including  X  have  been 
received.  This  mechanism  allows  for  straight-forward  duplicate 
detection  in  the  presence  of  retransmission.  Numbering  of  octets 
within  a  segment  is  that  the  first  data  octet  immediately  following 
the  header  is  the  lowest  numbered,  and  the  following  octets  are 
numbered consecutively.

It  is  essential  to  remember  that  the  actual  sequence  number  space  is 
finite,  though  very  large.  This  space  ranges  from  0  to  2**32  -  1. 
Since  the  space  is  finite,  all  arithmetic  dealing  with  sequence 
numbers must be performed modulo 2**32. This unsigned arithmetic
preserves the relationship of sequence numbers as they cycle from
2**32 - 1 to 0 again. There are some subtleties to computer modulo
arithmetic, so great care should be taken in programming the
comparison of such values. The symbol "=<" means "less than or equal"
(modulo 2**32).
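
A minimal sketch of such comparisons follows. It treats "a is less
than b" as "b lies less than half the sequence space ahead of a",
which is a widely used convention assumed here for illustration; the
function names are hypothetical.

```python
MOD = 1 << 32          # size of the sequence number space
HALF = MOD // 2

def seq_add(a: int, n: int) -> int:
    """Advance sequence number `a` by `n`, wrapping from 2**32 - 1 to 0."""
    return (a + n) % MOD

def seq_leq(a: int, b: int) -> bool:
    """a =< b (modulo 2**32), under the half-space assumption."""
    return (b - a) % MOD < HALF

def seq_lt(a: int, b: int) -> bool:
    """a < b (modulo 2**32), under the half-space assumption."""
    return a != b and seq_leq(a, b)
```

With these definitions 2**32 - 1 compares as less than 0, so the
ordering survives the wrap; a plain integer comparison would not.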

The  typical  kinds  of  sequence  number  comparisons  which  the  TCP  must 
perform  include: 

(a)  Determining  that  an  acknowledgment  refers  to  some  sequence 
number  sent  but  not  yet  acknowledged. 

(b)  Determining  that  all  sequence  numbers  occupied  by  a  segment 
have  been  acknowledged  (e.g.,  to  remove  the  segment  from  a 
retransmission  queue). 

(c) Determining that an incoming segment contains sequence numbers
which are expected (i.e., that the segment "overlaps" the
receive window).

In response to sending data the TCP will receive acknowledgments. The
following comparisons are needed to process the acknowledgments.

SND.UNA = oldest unacknowledged sequence number

SND.NXT = next sequence number to be sent

SEG.ACK = acknowledgment from the receiving TCP (next sequence
number expected by the receiving TCP)

SEG.SEQ = first sequence number of a segment

SEG.LEN = the number of octets occupied by the data in the segment
(counting SYN and FIN)

SEG.SEQ+SEG.LEN-1 = last sequence number of a segment

A new acknowledgment (called an "acceptable ack"), is one for which
the inequality below holds:

SND.UNA < SEG.ACK =< SND.NXT

A segment on the retransmission queue is fully acknowledged if the sum
of its sequence number and length is less than or equal to the
acknowledgment value in the incoming segment.
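
These two tests can be sketched as below, measuring distances modulo
2**32 so that they keep working across the wrap. The function names
are hypothetical, and the half-space bound in the second function is
an assumed convention for rejecting acknowledgments that lie behind
the segment.

```python
MOD = 1 << 32

def is_acceptable_ack(snd_una: int, seg_ack: int, snd_nxt: int) -> bool:
    """SND.UNA < SEG.ACK =< SND.NXT, evaluated modulo 2**32."""
    acked = (seg_ack - snd_una) % MOD         # octets newly acknowledged
    outstanding = (snd_nxt - snd_una) % MOD   # octets sent but unacknowledged
    return 0 < acked <= outstanding

def is_fully_acknowledged(seg_seq: int, seg_len: int, seg_ack: int) -> bool:
    """SEG.SEQ + SEG.LEN =< SEG.ACK for a segment on the retransmission queue."""
    covered = (seg_ack - seg_seq) % MOD       # octets of the segment covered by the ack
    return seg_len <= covered < MOD // 2
```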

When  data  is  received  the  following  comparisons  are  needed: 

RCV.NXT = next sequence number expected on an incoming segment, and
is the left or lower edge of the receive window

RCV.NXT+RCV.WND-1 = last sequence number expected on an incoming
segment, and is the right or upper edge of the receive window

SEG.SEQ = first sequence number occupied by the incoming segment

SEG.SEQ+SEG.LEN-1 = last sequence number occupied by the incoming
segment

A segment is judged to occupy a portion of valid receive sequence
space if

RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND

or

RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND

The first part of this test checks to see if the beginning of the
segment falls in the window, the second part of the test checks to see
if the end of the segment falls in the window; if the segment passes
either part of the test it contains data in the window.

Actually,  it  is  a  little  more  complicated  than  this.  Due  to  zero 
windows  and  zero  length  segments,  we  have  four  cases  for  the 
acceptability  of  an  incoming  segment: 

Segment Receive  Test
Length  Window
------- -------  -------------------------------------------

   0       0     SEG.SEQ = RCV.NXT

   0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND

  >0       0     not acceptable

  >0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND
              or RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND

Note that when the receive window is zero no segments should be
acceptable except ACK segments. Thus, it is possible for a TCP to
maintain a zero receive window while transmitting data and receiving
ACKs. However, even when the receive window is zero, a TCP must
process the RST and URG fields of all incoming segments.
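
The four cases of the acceptability test can be sketched as a single
function. The names are hypothetical; the window check measures the
distance from RCV.NXT modulo 2**32, which is one way of evaluating
the inequalities in the table.

```python
MOD = 1 << 32

def in_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """RCV.NXT =< seq < RCV.NXT+RCV.WND, evaluated modulo 2**32."""
    return (seq - rcv_nxt) % MOD < rcv_wnd

def segment_acceptable(seg_seq: int, seg_len: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """Apply the four-case test for an incoming segment."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_window(seg_seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        return False                           # data cannot fit a zero window
    last = (seg_seq + seg_len - 1) % MOD       # SEG.SEQ+SEG.LEN-1
    return in_window(seg_seq, rcv_nxt, rcv_wnd) or in_window(last, rcv_nxt, rcv_wnd)
```

For instance, with RCV.NXT=100 and RCV.WND=50, a 10 octet segment
starting at 95 is acceptable because its last octet (104) falls in
the window, while one starting at 150 is not.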

We  have  taken  advantage  of  the  numbering  scheme  to  protect  certain 
control  information  as  well.  This  is  achieved  by  implicitly  including 
some  control  flags  in  the  sequence  space  so  they  can  be  retransmitted 
and  acknowledged  without  confusion  (i.e.,  one  and  only  one  copy  of  the 
control  will  be  acted  upon).  Control  information  is  not  physically 
carried  in  the  segment  data  space.  Consequently,  we  must  adopt  rules 
for  implicitly  assigning  sequence  numbers  to  control.  The  SYN  and  FIN 
are  the  only  controls  requiring  this  protection,  and  these  controls 
are  used  only  at  connection  opening  and  closing.  For  sequence  number 
purposes, the SYN is considered to occur before the first actual data
octet  of  the  segment  in  which  it  occurs,  while  the  FIN  is  considered 
to  occur  after  the  last  actual  data  octet  in  a  segment  in  which  it 
occurs.  The  segment  length  (SEG.LEN)  includes  both  data  and  sequence 
space  occupying  controls.  When  a  SYN  is  present  then  SEG.SEQ  is  the 
sequence number of the SYN.

Initial Sequence Number Selection

The  protocol  places  no  restriction  on  a  particular  connection  being 
used  over  and  over  again.  A  connection  is  defined  by  a  pair  of 
sockets.  New  instances  of  a  connection  will  be  referred  to  as 
incarnations of the connection. The problem that arises from this is
-- "how does the TCP identify duplicate segments from previous
incarnations  of  the  connection?"  This  problem  becomes  apparent  if  the 
connection  is  being  opened  and  closed  in  quick  succession,  or  if  the 
connection  breaks  with  loss  of  memory  and  is  then  reestablished. 

To  avoid  confusion  we  must  prevent  segments  from  one  incarnation  of  a 
connection  from  being  used  while  the  same  sequence  numbers  may  still 
be  present  in  the  network  from  an  earlier  incarnation.  We  want  to 
assure  this,  even  if  a  TCP  crashes  and  loses  all  knowledge  of  the 
sequence  numbers  it  has  been  using.  When  new  connections  are  created, 
an  initial  sequence  number  (ISN)  generator  is  employed  which  selects  a 
new  32  bit  ISN.  The  generator  is  bound  to  a  (possibly  fictitious)  32 
bit  clock  whose  low  order  bit  is  incremented  roughly  every  4 
microseconds.  Thus,  the  ISN  cycles  approximately  every  4.55  hours. 
Since  we  assume  that  segments  will  stay  in  the  network  no  more  than 
the  Maximum  Segment  Lifetime  (MSL)  and  that  the  MSL  is  less  than  4.55 
hours we can reasonably assume that ISN's will be unique.
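
A clock-driven generator of this kind might be sketched as follows.
The function names are hypothetical, and the use of a monotonic
system clock as the underlying time source is an assumption; only the
4 microsecond tick and the 32 bit width come from the text above.

```python
import time

MOD = 1 << 32
TICK_MICROSECONDS = 4    # the low order bit advances roughly every 4 microseconds

def isn_from_clock(now_microseconds: int) -> int:
    """Map a clock reading in microseconds to a 32 bit ISN."""
    return (now_microseconds // TICK_MICROSECONDS) % MOD

def new_isn() -> int:
    """Select an ISN for a new connection from the (assumed) system clock."""
    return isn_from_clock(time.monotonic_ns() // 1000)
```

Because the value is reduced modulo 2**32, the generator returns to
the same ISN once the clock has advanced by 2**32 ticks.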

For  each  connection  there  is  a  send  sequence  number  and  a  receive 
sequence  number.  The  initial  send  sequence  number  (ISS)  is  chosen  by 
the  data  sending  TCP,  and  the  initial  receive  sequence  number  (IRS)  is 
learned  during  the  connection  establishing  procedure. 

For  a  connection  to  be  established  or  initialized,  the  two  TCPs  must 
synchronize  on  each  other's  initial  sequence  numbers.  This  is  done  in 
an  exchange  of  connection  establishing  segments  carrying  a  control  bit 
called  "SYN"  (for  synchronize)  and  the  initial  sequence  numbers.  As  a 
shorthand,  segments  carrying  the  SYN  bit  are  also  called  "SYNs". 

Hence,  the  solution  requires  a  suitable  mechanism  for  picking  an 
initial  sequence  number  and  a  slightly  involved  handshake  to  exchange 
the ISN's.

The synchronization requires each side to send its own initial
sequence number and to receive a confirmation of it in acknowledgment
from the other side. Each side must also receive the other side's
initial sequence number and send a confirming acknowledgment.

1) A --> B  SYN my sequence number is X
2) A <-- B  ACK your sequence number is X
3) A <-- B  SYN my sequence number is Y
4) A --> B  ACK your sequence number is Y

Because steps 2 and 3 can be combined in a single message this is
called the three way (or three message) handshake.

A  three  way  handshake  is  necessary  because  sequence  numbers  are  not 
tied  to  a  global  clock  in  the  network,  and  TCPs  may  have  different 
mechanisms  for  picking  the  ISN's.  The  receiver  of  the  first  SYN  has 
no way of knowing whether the segment was an old delayed one or not,
unless it remembers the last sequence number used on the connection
(which is not always possible), and so it must ask the sender to
verify this SYN. The three way handshake and the advantages of a
clock-driven scheme are discussed in [3].

Knowing  When  to  Keep  Quiet 

To  be  sure  that  a  TCP  does  not  create  a  segment  that  carries  a 
sequence  number  which  may  be  duplicated  by  an  old  segment  remaining  in 
the  network,  the  TCP  must  keep  quiet  for  a  maximum  segment  lifetime 
(MSL)  before  assigning  any  sequence  numbers  upon  starting  up  or 
recovering  from  a  crash  in  which  memory  of  sequence  numbers  in  use  was 
lost.  For  this  specification  the  MSL  is  taken  to  be  2  minutes.  This 
is  an  engineering  choice,  and  may  be  changed  if  experience  indicates 
it is desirable to do so. Note that if a TCP is reinitialized in some
sense, yet retains its memory of sequence numbers in use, then it need
not wait at all; it must only be sure to use sequence numbers larger
than those recently used.
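
The rule in this paragraph reduces to a small decision, sketched
below. The function name and its parameters are hypothetical; the 120
second constant is the 2 minute MSL adopted by this specification.

```python
MSL_SECONDS = 120.0    # the MSL is taken to be 2 minutes

def quiet_time_remaining(seconds_since_start: float,
                         remembers_sequence_numbers: bool) -> float:
    """Seconds the TCP should still keep quiet before assigning sequence numbers."""
    if remembers_sequence_numbers:
        return 0.0     # memory of sequence numbers in use survived: no wait needed
    return max(0.0, MSL_SECONDS - seconds_since_start)
```

A TCP that lost its memory 30 seconds ago would wait another 90
seconds under this sketch; one that has been up for longer than the
MSL need not wait at all.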

The  TCP  Quiet  Time  Concept 

This  specification  provides  that  hosts  which  "crash"  without 
retaining  any  knowledge  of  the  last  sequence  numbers  transmitted  on 
each active (i.e., not closed) connection shall delay emitting any
TCP  segments  for  at  least  the  agreed  Maximum  Segment  Lifetime  (MSL) 
in  the  internet  system  of  which  the  host  is  a  part.  In  the 
paragraphs  below,  an  explanation  for  this  specification  is  given. 

TCP  implementors  may  violate  the  "quiet  time"  restriction,  but  only 
at  the  risk  of  causing  some  old  data  to  be  accepted  as  new  or  new 
data  rejected  as  old  duplicated  by  some  receivers  in  the  internet 
system. 

TCPs  consume  sequence  number  space  each  time  a  segment  is  formed  and 
entered  into  the  network  output  queue  at  a  source  host.  The 
duplicate  detection  and  sequencing  algorithm  in  the  TCP  protocol 
relies  on  the  unique  binding  of  segment  data  to  sequence  space  to 
the extent that sequence numbers will not cycle through all 2**32
values before the segment data bound to those sequence numbers has
been delivered and acknowledged by the receiver and all duplicate
copies of the segments have "drained" from the internet. Without
such an assumption, two distinct TCP segments could conceivably be
assigned the same or overlapping sequence numbers, causing confusion
at the receiver as to which data is new and which is old. Remember
that each segment is bound to as many consecutive sequence numbers
as there are octets of data in the segment.

Under normal conditions, TCPs keep track of the next sequence number
to emit and the oldest awaiting acknowledgment so as to avoid
mistakenly using a sequence number over before its first use has
been acknowledged. This alone does not guarantee that old duplicate
data is drained from the net, so the sequence space has been made
very large to reduce the probability that a wandering duplicate will
cause trouble upon arrival. At 2 megabits/sec. it takes 4.5 hours
to use up 2**32 octets of sequence space. Since the maximum segment
lifetime in the net is not likely to exceed a few tens of seconds,
this is deemed ample protection for foreseeable nets, even if data
rates escalate to 10's of megabits/sec. At 100 megabits/sec, the
cycle time is 5.4 minutes which may be a little short, but still
within reason.

The basic duplicate detection and sequencing algorithm in TCP can be
defeated, however, if a source TCP does not have any memory of the
sequence numbers it last used on a given connection. For example, if
the TCP were to start all connections with sequence number 0, then
upon crashing and restarting, a TCP might re-form an earlier
connection (possibly after half-open connection resolution) and emit
packets with sequence numbers identical to or overlapping with
packets still in the network which were emitted on an earlier
incarnation of the same connection. In the absence of knowledge
about the sequence numbers used on a particular connection, the TCP
specification recommends that the source delay for MSL seconds
before emitting segments on the connection, to allow time for
segments from the earlier connection incarnation to drain from the
system.

Even hosts which can remember the time of day and use it to select
initial sequence number values are not immune from this problem
(i.e., even if time of day is used to select an initial sequence
number for each new connection incarnation).

Suppose, for example, that a connection is opened starting with
sequence number S. Suppose that this connection is not used much
and that eventually the initial sequence number function (ISN(t))
takes on a value equal to the sequence number, say S1, of the last
segment sent by this TCP on a particular connection. Now suppose,
at this instant, the host crashes, recovers, and establishes a new
incarnation of the connection. The initial sequence number chosen is
S1 = ISN(t) -- last used sequence number on old incarnation of
connection! If the recovery occurs quickly enough, any old
duplicates in the net bearing sequence numbers in the neighborhood
of S1 may arrive and be treated as new packets by the receiver of
the new incarnation of the connection.

The  problem  is  that  the  recovering  host  may  not  know  for  how  long  it 
crashed  nor  does  it  know  whether  there  are  still  old  duplicates  in 
the  system  from  earlier  connection  incarnations. 

One way to deal with this problem is to deliberately delay emitting
segments for one MSL after recovery from a crash; this is the "quiet
time" specification. Hosts which prefer to avoid waiting and are
willing to risk possible confusion of old and new packets at a given
destination may choose not to wait for the "quiet time".

Implementors may provide TCP users with the ability to select on a
connection by connection basis whether to wait after a crash, or may
informally implement the "quiet time" for all connections.

Obviously,  even  where  a  user  selects  to  "wait,"  this  is  not 
necessary  after  the  host  has  been  "up"  for  at  least  MSL  seconds. 

To summarize: every segment emitted occupies one or more sequence
numbers  in  the  sequence  space,  the  numbers  occupied  by  a  segment  are 
"busy"  or  "in  use"  until  MSL  seconds  have  passed,  upon  crashing  a 
block  of  space-time  is  occupied  by  the  octets  of  the  last  emitted 
segment,  if  a  new  connection  is  started  too  soon  and  uses  any  of  the 
sequence  numbers  in  the  space-time  footprint  of  the  last  segment  of 
the  previous  connection  incarnation,  there  is  a  potential  sequence 
number  overlap  area  which  could  cause  confusion  at  the  receiver. 

3.4.  Establishing  a  connection 

The  "three-way  handshake"  is  the  procedure  used  to  establish  a 
connection.  This  procedure  normally  is  initiated  by  one  TCP  and 
responded to by another TCP. The procedure also works if two TCPs
simultaneously initiate the procedure. When a simultaneous attempt
occurs, each TCP receives a "SYN" segment which carries no
acknowledgment after it has sent a "SYN". Of course, the arrival of
an  old  duplicate  "SYN"  segment  can  potentially  make  it  appear,  to  the 
recipient,  that  a  simultaneous  connection  initiation  is  in  progress. 
Proper  use  of  "reset"  segments  can  disambiguate  these  cases. 

Several examples of connection initiation follow. Although these
examples do not show connection synchronization using data-carrying
segments, this is perfectly legitimate, so long as the receiving TCP
doesn't deliver the data to the user until it is clear the data is
valid (i.e., the data must be buffered at the receiver until the
connection reaches the ESTABLISHED state). The three-way handshake
reduces the possibility of false connections. It is the
implementation of a trade-off between memory and messages to provide
information for this checking.

The simplest three-way handshake is shown in figure 7 below. The
figures should be interpreted in the following way. Each line is
numbered for reference purposes. Right arrows (-->) indicate
departure of a TCP segment from TCP A to TCP B, or arrival of a
segment at B from A. Left arrows (<--), indicate the reverse.
Ellipsis (...) indicates a segment which is still in the network
(delayed). An "XXX" indicates a segment which is lost or rejected.
Comments appear in parentheses. TCP states represent the state AFTER
the departure or arrival of the segment (whose contents are shown in
the center of each line). Segment contents are shown in abbreviated
form, with sequence number, control flags, and ACK field. Other
fields such as window, addresses, lengths, and text have been left out
in the interest of clarity.


      TCP A                                                TCP B

  1.  CLOSED                                               LISTEN

  2.  SYN-SENT    --> <SEQ=100><CTL=SYN>               --> SYN-RECEIVED

  3.  ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK>  <-- SYN-RECEIVED

  4.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK>       --> ESTABLISHED

  5.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK><DATA> --> ESTABLISHED

          Basic 3-Way Handshake for Connection Synchronization

                                Figure 7.

In line 2 of figure 7, TCP A begins by sending a SYN segment
indicating that it will use sequence numbers starting with sequence
number 100. In line 3, TCP B sends a SYN and acknowledges the SYN it
received from TCP A. Note that the acknowledgment field indicates TCP
B is now expecting to hear sequence 101, acknowledging the SYN which
occupied sequence 100.

At line 4, TCP A responds with an empty segment containing an ACK for
TCP B's SYN; and in line 5, TCP A sends some data. Note that the
sequence number of the segment in line 5 is the same as in line 4
because the ACK does not occupy sequence number space (if it did, we
would wind up ACKing ACK's!).
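
The sequence arithmetic of figure 7 can be traced with a short
sketch. The Python below is illustrative only: the Segment tuple and
the three_way_handshake helper are hypothetical names that do not
appear in this specification, and only the SEQ, ACK, and CTL fields
shown in the figure are modeled.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    seq: int
    ack: Optional[int]   # None when the ACK control bit is off
    ctl: str             # control flags, e.g. "SYN", "SYN,ACK", "ACK"
    data: bytes = b""


def three_way_handshake(iss_a: int, iss_b: int, data: bytes) -> List[Segment]:
    """Return the segments of lines 2-5 of figure 7 for the given initial sequence numbers."""
    syn = Segment(seq=iss_a, ack=None, ctl="SYN")                  # line 2
    # A SYN occupies one sequence number, so B acknowledges iss_a + 1.
    syn_ack = Segment(seq=iss_b, ack=syn.seq + 1, ctl="SYN,ACK")   # line 3
    ack = Segment(seq=iss_a + 1, ack=syn_ack.seq + 1, ctl="ACK")   # line 4
    # A bare ACK occupies no sequence space, so the data segment reuses its SEQ.
    first_data = Segment(seq=ack.seq, ack=ack.ack, ctl="ACK", data=data)  # line 5
    return [syn, syn_ack, ack, first_data]


for segment in three_way_handshake(100, 300, b"hello"):
    print(segment)
```

With initial sequence numbers 100 and 300 the four segments carry the
same SEQ and ACK values as lines 2 through 5 of figure 7.
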

Simultaneous initiation is only slightly more complex, as is shown in
figure 8. Each TCP cycles from CLOSED to SYN-SENT to SYN-RECEIVED to
ESTABLISHED.


    TCP A                                                TCP B

1.  CLOSED                                               CLOSED

2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

6.  ESTABLISHED  <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED

7.               ... <SEQ=101><ACK=301><CTL=ACK>     --> ESTABLISHED

              Simultaneous Connection Synchronization

                             Figure 8.

The principal reason for the three-way handshake is to prevent old
duplicate connection initiations from causing confusion. To deal with
this, a special control message, reset, has been devised. If the
receiving TCP is in a non-synchronized state (i.e., SYN-SENT,
SYN-RECEIVED), it returns to LISTEN on receiving an acceptable reset.
If the TCP is in one of the synchronized states (ESTABLISHED,
FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT), it
aborts the connection and informs its user. We discuss this latter
case under "half-open" connections below.

    TCP A                                               TCP B

1.  CLOSED                                              LISTEN

2.  SYN-SENT    --> <SEQ=100><CTL=SYN>              ...

3.  (duplicate) ... <SEQ=90><CTL=SYN>               --> SYN-RECEIVED

4.  SYN-SENT    <-- <SEQ=300><ACK=91><CTL=SYN,ACK>  <-- SYN-RECEIVED

5.  SYN-SENT    --> <SEQ=91><CTL=RST>               --> LISTEN

6.              ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

7.  SYN-SENT    <-- <SEQ=400><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED

8.  ESTABLISHED --> <SEQ=101><ACK=401><CTL=ACK>     --> ESTABLISHED

                  Recovery from Old Duplicate SYN

                             Figure 9.

As a simple example of recovery from old duplicates, consider
figure 9. At line 3, an old duplicate SYN arrives at TCP B. TCP B
cannot tell that this is an old duplicate, so it responds normally
(line 4). TCP A detects that the ACK field is incorrect and returns a
RST (reset) with its SEQ field selected to make the segment
believable. TCP B, on receiving the RST, returns to the LISTEN state.
When the original SYN (pun intended) finally arrives at line 6, the
synchronization proceeds normally. If the SYN at line 6 had arrived
before the RST, a more complex exchange might have occurred with RST's
sent in both directions.
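
The check TCP A performs at line 4 of figure 9 can be sketched as
follows. This is a minimal illustration, not a definitive
implementation: the function name rst_seq_for_bad_ack is hypothetical,
and the acceptability test assumed here is that an ACK received in
SYN-SENT must acknowledge something already sent, compared modulo
2**32.

```python
MOD = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def rst_seq_for_bad_ack(iss: int, snd_nxt: int, seg_ack: int):
    """In SYN-SENT, return the SEQ for a reset, or None if the ACK is acceptable.

    The ACK is treated as acceptable when ISS < SEG.ACK <= SND.NXT,
    with both comparisons made modulo 2**32.
    """
    acceptable = 0 < (seg_ack - iss) % MOD <= (snd_nxt - iss) % MOD
    # The reset takes its sequence number from the offending ACK field,
    # which is what makes it believable to the other side.
    return None if acceptable else seg_ack


# Figure 9, line 4: ACK=91 does not acknowledge the SYN sent with SEQ=100.
print(rst_seq_for_bad_ack(iss=100, snd_nxt=101, seg_ack=91))
# Figure 9, line 7: ACK=101 acknowledges the SYN, so no reset is sent.
print(rst_seq_for_bad_ack(iss=100, snd_nxt=101, seg_ack=101))
```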

Half-Open  Connections  and  Other  Anomalies 



An established connection is said to be "half-open" if one of the
TCPs has closed or aborted the connection at its end without the
knowledge of the other, or if the two ends of the connection have
become desynchronized owing to a crash that resulted in loss of
memory. Such connections will automatically become reset if an
attempt is made to send data in either direction. However, half-open
connections are expected to be unusual, and the recovery procedure is
mildly involved.

If at site A the connection no longer exists, then an attempt by the
user at site B to send any data on it will result in the site B TCP
receiving a reset control message. Such a message indicates to the
site B TCP that something is wrong, and it is expected to abort the
connection.

Assume that two user processes A and B are communicating with one
another when a crash occurs causing loss of memory to A's TCP.
Depending on the operating system supporting A's TCP, it is likely
that some error recovery mechanism exists. When the TCP is up again,
A is likely to start again from the beginning or from a recovery
point. As a result, A will probably try to OPEN the connection again
or try to SEND on the connection it believes open. In the latter
case, it receives the error message "connection not open" from the
local (A's) TCP. In an attempt to establish the connection, A's TCP
will send a segment containing SYN. This scenario leads to the
example shown in figure 10. After TCP A crashes, the user attempts to
re-open the connection. TCP B, in the meantime, thinks the connection
is open.


    TCP A                                        TCP B

1.  (CRASH)                             (send 300,receive 100)

2.  CLOSED                                       ESTABLISHED

3.  SYN-SENT --> <SEQ=400><CTL=SYN>          --> (??)

4.  (!!)     <-- <SEQ=300><ACK=100><CTL=ACK> <-- ESTABLISHED

5.  SYN-SENT --> <SEQ=100><CTL=RST>          --> (Abort!!)

6.  SYN-SENT                                     CLOSED

7.  SYN-SENT --> <SEQ=400><CTL=SYN>          -->

               Half-Open Connection Discovery

                        Figure 10.

When the SYN arrives at line 3, TCP B, being in a synchronized state,
and the incoming segment outside the window, responds with an
acknowledgment indicating what sequence it next expects to hear (ACK
100). TCP A sees that this segment does not acknowledge anything it
sent and, being unsynchronized, sends a reset (RST) because it has
detected a half-open connection. TCP B aborts at line 5. TCP A will
continue to try to establish the connection; the problem is now
reduced to the basic 3-way handshake of figure 7.

An  interesting  alternative  case  occurs  when  TCP  A  crashes  and  TCP  B 
tries to send data on what it thinks is a synchronized connection.
This  is  illustrated  in  figure  11.  In  this  case,  the  data  arriving  at 
TCP  A  from  TCP  B  (line  2)  is  unacceptable  because  no  such  connection 
exists,  so  TCP  A  sends  a  RST.  The  RST  is  acceptable  so  TCP  B 
processes  it  and  aborts  the  connection. 


    TCP A                                                TCP B

1.  (CRASH)                                 (send 300,receive 100)

2.  (??)    <-- <SEQ=300><ACK=100><DATA=10><CTL=ACK> <-- ESTABLISHED

3.          --> <SEQ=100><CTL=RST>                   --> (ABORT!!)

         Active Side Causes Half-Open Connection Discovery

                            Figure 11.

In  figure  12,  we  find  the  two  TCPs  A  and  B  with  passive  connections 
waiting  for  SYN.  An  old  duplicate  arriving  at  TCP  B  (line  2)  stirs  B 
into  action.  A  SYN-ACK  is  returned  (line  3)  and  causes  TCP  A  to 
generate  a  RST  (the  ACK  in  line  3  is  not  acceptable).  TCP  B  accepts 
the  reset  and  returns  to  its  passive  LISTEN  state. 


    TCP A                                      TCP B

1.  LISTEN                                     LISTEN

2.       ... <SEQ=Z><CTL=SYN>              --> SYN-RECEIVED

3.  (??) <-- <SEQ=X><ACK=Z+1><CTL=SYN,ACK> <-- SYN-RECEIVED

4.       --> <SEQ=Z+1><CTL=RST>            --> (return to LISTEN!)

5.  LISTEN                                     LISTEN

    Old Duplicate SYN Initiates a Reset on two Passive Sockets

                         Figure 12.

A variety of other cases are possible, all of which are accounted for
by the following rules for RST generation and processing.

Reset  Generation 

As a general rule, reset (RST) must be sent whenever a segment arrives
which apparently is not intended for the current connection. A reset
must not be sent if it is not clear that this is the case.

There  are  three  groups  of  states: 

1. If the connection does not exist (CLOSED) then a reset is sent
in response to any incoming segment except another reset. In
particular, SYNs addressed to a non-existent connection are rejected
by this means.

If the incoming segment has an ACK field, the reset takes its
sequence number from the ACK field of the segment, otherwise the
reset has sequence number zero and the ACK field is set to the sum
of the sequence number and segment length of the incoming segment.
The connection remains in the CLOSED state.

2. If the connection is in any non-synchronized state (LISTEN,
SYN-SENT, SYN-RECEIVED), and the incoming segment acknowledges
something not yet sent (the segment carries an unacceptable ACK), or
if an incoming segment has a security level or compartment which
does not exactly match the level and compartment requested for the
connection, a reset is sent.

If our SYN has not been acknowledged and the precedence level of the
incoming segment is higher than the precedence level requested then
either raise the local precedence level (if allowed by the user and
the system) or send a reset; or if the precedence level of the
incoming segment is lower than the precedence level requested then
continue as if the precedence matched exactly (if the remote TCP
cannot raise the precedence level to match ours this will be
detected in the next segment it sends, and the connection will be
terminated then). If our SYN has been acknowledged (perhaps in this
incoming segment) the precedence level of the incoming segment must
match the local precedence level exactly; if it does not, a reset
must be sent.


If the incoming segment has an ACK field, the reset takes its
sequence number from the ACK field of the segment, otherwise the
reset has sequence number zero and the ACK field is set to the sum
of the sequence number and segment length of the incoming segment.
The connection remains in the same state.

3. If the connection is in a synchronized state (ESTABLISHED,
FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT),
any unacceptable segment (out of window sequence number or
unacceptable acknowledgment number) must elicit only an empty
acknowledgment segment containing the current send-sequence number
and an acknowledgment indicating the next sequence number expected
to be received, and the connection remains in the same state.

If an incoming segment has a security level, or compartment, or
precedence which does not exactly match the level, and compartment,
and precedence requested for the connection, a reset is sent and the
connection goes to the CLOSED state. The reset takes its sequence
number from the ACK field of the incoming segment.
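
The rule for choosing the sequence and acknowledgment numbers of a
reset, stated above for groups 1 and 2, can be sketched in a few
lines. The names Incoming, Reset, and reset_for are hypothetical and
exist only for this illustration; the sketch covers the CLOSED case
and assumes sequence arithmetic modulo 2**32.

```python
from typing import NamedTuple, Optional

MOD = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


class Incoming(NamedTuple):
    seq: int                   # sequence number of the incoming segment
    length: int                # segment length: data octets plus one each for SYN and FIN
    ack: Optional[int] = None  # ACK field, or None when the segment has none
    rst: bool = False          # True when the incoming segment is itself a reset


class Reset(NamedTuple):
    seq: int
    ack: Optional[int]         # None when the reset carries no ACK field


def reset_for(seg: Incoming) -> Optional[Reset]:
    """Build the reset for a segment arriving in the CLOSED state (group 1)."""
    if seg.rst:
        return None  # a reset is never sent in response to another reset
    if seg.ack is not None:
        # Take the sequence number from the ACK field of the offending segment.
        return Reset(seq=seg.ack, ack=None)
    # No ACK field: sequence number zero, ACK covers the whole incoming segment.
    return Reset(seq=0, ack=(seg.seq + seg.length) % MOD)


# A segment carrying ACK=91 draws a reset with SEQ=91 (compare figure 9, line 5,
# where the same numbering rule is applied in a non-synchronized state).
print(reset_for(Incoming(seq=300, length=1, ack=91)))
# A bare SYN with SEQ=100 draws a reset with SEQ=0 and ACK=101.
print(reset_for(Incoming(seq=100, length=1)))
```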

Reset  Processing 

In  all  states  except  SYN-SENT,  all  reset  (RST)  segments  are  validated 
by  checking  their  SEQ-fields.  A  reset  is  valid  if  its  sequence  number 
is  in  the  window.  In  the  SYN-SENT  state  (a  RST  received  in  response 
to  an  initial  SYN),  the  RST  is  acceptable  if  the  ACK  field 
acknowledges  the  SYN. 

The  receiver  of  a  RST  first  validates  it,  then  changes  state.  If  the 
receiver  was  in  the  LISTEN  state,  it  ignores  it.  If  the  receiver  was 
in  SYN-RECEIVED  state  and  had  previously  been  in  the  LISTEN  state, 
then  the  receiver  returns  to  the  LISTEN  state,  otherwise  the  receiver 
aborts  the  connection  and  goes  to  the  CLOSED  state.  If  the  receiver 
was  in  any  other  state,  it  aborts  the  connection  and  advises  the  user 
and  goes  to  the  CLOSED  state. 
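
The validation and state changes described above can be sketched as
two small functions. The function names and the came_from_listen flag
are hypothetical. The window test is a simplification that ignores the
zero-window special case, and the SYN-SENT test assumes that only the
SYN is outstanding.

```python
MOD = 2 ** 32  # sequence numbers wrap modulo 2**32


def in_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if RCV.NXT <= seq < RCV.NXT + RCV.WND, compared modulo 2**32."""
    return (seq - rcv_nxt) % MOD < rcv_wnd


def reset_is_valid(state: str, rst_seq: int, rst_ack, rcv_nxt: int,
                   rcv_wnd: int, iss: int) -> bool:
    """Validate a reset according to the state in which it is received."""
    if state == "SYN-SENT":
        # Only the SYN has been sent, so acknowledging it means ACK = ISS + 1.
        return rst_ack == (iss + 1) % MOD
    return in_window(rst_seq, rcv_nxt, rcv_wnd)


def state_after_reset(state: str, came_from_listen: bool = False) -> str:
    """Next state once a reset has been validated."""
    if state == "LISTEN":
        return "LISTEN"   # the reset is ignored
    if state == "SYN-RECEIVED" and came_from_listen:
        return "LISTEN"   # passive open: go back to waiting for a SYN
    return "CLOSED"       # abort the connection


# Figure 9, line 5: TCP B (SYN-RECEIVED after a passive open) gets <SEQ=91><CTL=RST>.
if reset_is_valid("SYN-RECEIVED", 91, None, rcv_nxt=91, rcv_wnd=1000, iss=300):
    print(state_after_reset("SYN-RECEIVED", came_from_listen=True))
```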

3.5.  Closing  a  Connection 

CLOSE is an operation meaning "I have no more data to send." The
notion of closing a full-duplex connection is subject to ambiguous
interpretation, of course, since it may not be obvious how to treat
the receiving side of the connection. We have chosen to treat CLOSE
in a simplex fashion. The user who CLOSEs may continue to RECEIVE
until he is told that the other side has CLOSED also. Thus, a program
could initiate several SENDs followed by a CLOSE, and then continue to
RECEIVE until signaled that a RECEIVE failed because the other side
has CLOSED. We assume that the TCP will signal a user, even if no
RECEIVEs are outstanding, that the other side has closed, so the user
can terminate his side gracefully. A TCP will reliably deliver all
buffers SENT before the connection was CLOSED so a user who expects no
data in return need only wait to hear the connection was CLOSED
successfully to know that all his data was received at the destination
TCP. Users must keep reading connections they close for sending until
the TCP says no more data.

There are essentially three cases:

1)  The  user  initiates  by  telling  the  TCP  to  CLOSE  the  connection 

2)  The  remote  TCP  initiates  by  sending  a  FIN  control  signal 

3)  Both  users  CLOSE  simultaneously 

Case  1:  Local  user  initiates  the  close 

In this case, a FIN segment can be constructed and placed on the
outgoing segment queue. No further SENDs from the user will be
accepted by the TCP, and it enters the FIN-WAIT-1 state. RECEIVEs
are allowed in this state. All segments preceding and including FIN
will be retransmitted until acknowledged. When the other TCP has
both acknowledged the FIN and sent a FIN of its own, the first TCP
can ACK this FIN. Note that a TCP receiving a FIN will ACK but not
send its own FIN until its user has CLOSED the connection also.

Case 2: TCP receives a FIN from the network

If an unsolicited FIN arrives from the network, the receiving TCP
can ACK it and tell the user that the connection is closing. The
user will respond with a CLOSE, upon which the TCP can send a FIN to
the other TCP after sending any remaining data. The TCP then waits
until its own FIN is acknowledged whereupon it deletes the
connection. If an ACK is not forthcoming, after the user timeout
the connection is aborted and the user is told.


Case  3:  both  users  close  simultaneously 

A simultaneous CLOSE by users at both ends of a connection causes
FIN segments to be exchanged. When all segments preceding the FINs
have been processed and acknowledged, each TCP can ACK the FIN it
has received. Both will, upon receiving these ACKs, delete the
connection.

    TCP A                                               TCP B

1.  ESTABLISHED                                         ESTABLISHED

2.  (Close)
    FIN-WAIT-1  --> <SEQ=100><ACK=300><CTL=FIN,ACK> --> CLOSE-WAIT

3.  FIN-WAIT-2  <-- <SEQ=300><ACK=101><CTL=ACK>     <-- CLOSE-WAIT

4.                                                      (Close)
    TIME-WAIT   <-- <SEQ=300><ACK=101><CTL=FIN,ACK> <-- LAST-ACK

5.  TIME-WAIT   --> <SEQ=101><ACK=301><CTL=ACK>     --> CLOSED

6.  (2 MSL)
    CLOSED

                       Normal Close Sequence

                            Figure 13.


    TCP A                                               TCP B

1.  ESTABLISHED                                         ESTABLISHED

2.  (Close)                                             (Close)
    FIN-WAIT-1  --> <SEQ=100><ACK=300><CTL=FIN,ACK> ... FIN-WAIT-1
                <-- <SEQ=300><ACK=100><CTL=FIN,ACK> <--
                ... <SEQ=100><ACK=300><CTL=FIN,ACK> -->

3.  CLOSING     --> <SEQ=101><ACK=301><CTL=ACK>     ... CLOSING
                <-- <SEQ=301><ACK=101><CTL=ACK>     <--
                ... <SEQ=101><ACK=301><CTL=ACK>     -->

4.  TIME-WAIT                                           TIME-WAIT
    (2 MSL)                                             (2 MSL)
    CLOSED                                              CLOSED

                    Simultaneous Close Sequence

                            Figure 14.
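
The state changes traced in figures 13 and 14 can be replayed with a
small transition table. This is a sketch under stated assumptions: the
event names (user_close, recv_fin, recv_ack_of_fin, timeout_2msl) and
the run helper are hypothetical, and only the closing part of the
state diagram is modeled.

```python
# (state, event) -> next state, for the closing states only
CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "user_close"): "FIN-WAIT-1",      # send FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE-WAIT",        # ACK the FIN, tell the user
    ("FIN-WAIT-1", "recv_ack_of_fin"): "FIN-WAIT-2",
    ("FIN-WAIT-1", "recv_fin"): "CLOSING",            # simultaneous close
    ("FIN-WAIT-2", "recv_fin"): "TIME-WAIT",          # ACK the FIN
    ("CLOSING", "recv_ack_of_fin"): "TIME-WAIT",
    ("CLOSE-WAIT", "user_close"): "LAST-ACK",         # send FIN
    ("LAST-ACK", "recv_ack_of_fin"): "CLOSED",
    ("TIME-WAIT", "timeout_2msl"): "CLOSED",
}


def run(state: str, events: list) -> str:
    """Apply each event in turn and return the final state."""
    for event in events:
        state = CLOSE_TRANSITIONS[(state, event)]
    return state


# TCP A in figure 13 (the side that closes first).
print(run("ESTABLISHED", ["user_close", "recv_ack_of_fin", "recv_fin", "timeout_2msl"]))
# TCP B in figure 13 (the side that receives the first FIN).
print(run("ESTABLISHED", ["recv_fin", "user_close", "recv_ack_of_fin"]))
# Either TCP in figure 14 (simultaneous close).
print(run("ESTABLISHED", ["user_close", "recv_fin", "recv_ack_of_fin", "timeout_2msl"]))
```
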

3.6. Precedence and Security

The intent is that connection be allowed only between ports operating
with exactly the same security and compartment values and at the
higher of the precedence level requested by the two ports.

The precedence and security parameters used in TCP are exactly those
defined in the Internet Protocol (IP) [2]. Throughout this TCP
specification the term "security/compartment" is intended to indicate
the security parameters used in IP including security, compartment,
user group, and handling restriction.

A connection attempt with mismatched security/compartment values or a
lower precedence value must be rejected by sending a reset. Rejecting
a connection due to too low a precedence only occurs after an
acknowledgment of the SYN has been received.

Note  that  TCP  modules  which  operate  only  at  the  default  value  of 
precedence  will  still  have  to  check  the  precedence  of  incoming 
segments  and  possibly  raise  the  precedence  level  they  use  on  the 
connection. 


The security parameters may be used even in a non-secure environment
(the values would indicate unclassified data), thus hosts in
non-secure environments must be prepared to receive the security
parameters, though they need not send them.

3.7.  Data  Communication 


Once  the  connection  is  established  data  is  communicated  by  the 
exchange  of  segments.  Because  segments  may  be  lost  due  to  errors 
(checksum  test  failure),  or  network  congestion,  TCP  uses 
retransmission  (after  a  timeout)  to  ensure  delivery  of  every  segment. 
Duplicate  segments  may  arrive  due  to  network  or  TCP  retransmission. 

As discussed in the section on sequence numbers the TCP performs
certain tests on the sequence and acknowledgment numbers in the
segments to verify their acceptability.


The sender of data keeps track of the next sequence number to use in
the variable SND.NXT. The receiver of data keeps track of the next
sequence number to expect in the variable RCV.NXT. The sender of data
keeps track of the oldest unacknowledged sequence number in the
variable SND.UNA. If the data flow is momentarily idle and all data
sent has been acknowledged then the three variables will be equal.


When the sender creates a segment and transmits it the sender advances
SND.NXT. When the receiver accepts a segment it advances RCV.NXT and
sends an acknowledgment. When the data sender receives an
acknowledgment it advances SND.UNA. The extent to which the values of
these variables differ is a measure of the delay in the communication.
The amount by which the variables are advanced is the length of the
data in the segment. Note that once in the ESTABLISHED state all
segments must carry current acknowledgment information.
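
How SND.NXT, SND.UNA, and RCV.NXT move can be illustrated with a
minimal sketch. The Sender and Receiver classes are hypothetical, the
receiver accepts only in-order segments, and windows and
retransmission are left out for brevity.

```python
MOD = 2 ** 32  # sequence numbers wrap modulo 2**32


class Sender:
    def __init__(self, initial_seq: int) -> None:
        self.snd_una = initial_seq  # oldest unacknowledged sequence number
        self.snd_nxt = initial_seq  # next sequence number to use

    def send(self, data: bytes) -> int:
        """Transmit data and return the sequence number of its first octet."""
        seq = self.snd_nxt
        self.snd_nxt = (self.snd_nxt + len(data)) % MOD
        return seq

    def on_ack(self, ack: int) -> None:
        # Advance SND.UNA only for an ACK that covers new data already sent.
        if 0 < (ack - self.snd_una) % MOD <= (self.snd_nxt - self.snd_una) % MOD:
            self.snd_una = ack


class Receiver:
    def __init__(self, initial_seq: int) -> None:
        self.rcv_nxt = initial_seq  # next sequence number expected

    def accept(self, seq: int, data: bytes) -> int:
        """Accept an in-order segment and return the value for the ACK field."""
        if seq == self.rcv_nxt:
            self.rcv_nxt = (self.rcv_nxt + len(data)) % MOD
        return self.rcv_nxt


sender, receiver = Sender(101), Receiver(101)
seq = sender.send(b"hello")
print(sender.snd_una, sender.snd_nxt)   # the variables differ while data is in flight
sender.on_ack(receiver.accept(seq, b"hello"))
print(sender.snd_una, sender.snd_nxt, receiver.rcv_nxt)  # equal again once idle
```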

The CLOSE user call implies a push function, as does the FIN control
flag in an incoming segment.

Retransmission  Timeout 

Because  of  the  variability  of  the  networks  that  compose  an 
internetwork  system  and  the  wide  range  of  uses  of  TCP  connections  the 
retransmission  timeout  must  be  dynamically  determined.  One  procedure 
for  determining  a  retransmission  time  out  is  given  here  as  an 
illustration. 

An  Example  Retransmission  Timeout  Procedure 

Measure the elapsed time between sending a data octet with a
particular sequence number and receiving an acknowledgment that
covers that sequence number (segments sent do not have to match
segments received). This measured elapsed time is the Round Trip
Time (RTT). Next compute a Smoothed Round Trip Time (SRTT) as:

SRTT = ( ALPHA * SRTT ) + ((1-ALPHA) * RTT)

and based on this, compute the retransmission timeout (RTO) as:

RTO = min[UBOUND,max[LBOUND,(BETA*SRTT)]]

where UBOUND is an upper bound on the timeout (e.g., 1 minute),
LBOUND is a lower bound on the timeout (e.g., 1 second), ALPHA is
a smoothing factor (e.g., .8 to .9), and BETA is a delay variance
factor (e.g., 1.3 to 2.0).
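
The example procedure translates directly into code. The sketch below
is illustrative: the class name RtoEstimator is hypothetical, the
defaults are picked from the example ranges given above, and seeding
SRTT with the first measurement is an assumption, since the text does
not say how SRTT is initialized.

```python
class RtoEstimator:
    """Smoothed round trip time and retransmission timeout, in seconds."""

    def __init__(self, alpha: float = 0.85, beta: float = 1.5,
                 lbound: float = 1.0, ubound: float = 60.0) -> None:
        self.alpha = alpha    # smoothing factor (e.g., .8 to .9)
        self.beta = beta      # delay variance factor (e.g., 1.3 to 2.0)
        self.lbound = lbound  # lower bound on the timeout
        self.ubound = ubound  # upper bound on the timeout
        self.srtt = None      # no measurement yet

    def sample(self, rtt: float) -> float:
        """Fold one RTT measurement into SRTT and return the new RTO."""
        if self.srtt is None:
            self.srtt = rtt   # assumption: the first sample seeds SRTT
        else:
            self.srtt = self.alpha * self.srtt + (1 - self.alpha) * rtt
        return min(self.ubound, max(self.lbound, self.beta * self.srtt))


estimator = RtoEstimator()
for measured in (0.2, 0.3, 2.5, 0.25):
    print(round(estimator.sample(measured), 3))
```

Note that a single large sample moves the timeout only gradually,
which is the purpose of the smoothing factor, and that LBOUND keeps
the timeout from collapsing on very fast paths.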

The  Communication  of  Urgent  Information 

The  objective  of  the  TCP  urgent  mechanism  is  to  allow  the  sending  user 
to  stimulate  the  receiving  user  to  accept  some  urgent  data  and  to 
permit  the  receiving  TCP  to  indicate  to  the  receiving  user  when  all 
the  currently  known  urgent  data  has  been  received  by  the  user. 

This mechanism permits a point in the data stream to be designated as
the end of urgent information. Whenever this point is in advance of
the receive sequence number (RCV.NXT) at the receiving TCP, that TCP
must tell the user to go into "urgent mode"; when the receive sequence
number catches up to the urgent pointer, the TCP must tell the user to
go into "normal mode". If the urgent pointer is updated while the user
is in "urgent mode", the update will be invisible to the user.

Transmission Control Protocol
Functional Specification

September 1981

The method employs an urgent field which is carried in all segments
transmitted. The URG control flag indicates that the urgent field is
meaningful and must be added to the segment sequence number to yield
the urgent pointer. The absence of this flag indicates that there is
no urgent data outstanding.

To  send  an  urgent  indication  the  user  must  also  send  at  least  one  data 
octet.  If  the  sending  user  also  indicates  a  push,  timely  delivery  of 
the  urgent  information  to  the  destination  process  is  enhanced. 
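
The urgent pointer arithmetic can be sketched as below. Both function
names are hypothetical, and the test for "in advance of RCV.NXT"
assumes the usual modulo 2**32 comparison in which a difference of
less than 2**31 counts as ahead.

```python
MOD = 2 ** 32  # sequence numbers wrap modulo 2**32


def urgent_pointer(seg_seq: int, urgent_field: int) -> int:
    """Urgent pointer = segment sequence number + urgent field (URG flag set)."""
    return (seg_seq + urgent_field) % MOD


def in_urgent_mode(rcv_nxt: int, urg_ptr: int) -> bool:
    """True while the urgent pointer is in advance of RCV.NXT."""
    ahead = (urg_ptr - rcv_nxt) % MOD
    return 0 < ahead < 2 ** 31


pointer = urgent_pointer(seg_seq=1000, urgent_field=20)
print(pointer)
print(in_urgent_mode(rcv_nxt=1005, urg_ptr=pointer))  # user is told "urgent mode"
print(in_urgent_mode(rcv_nxt=1020, urg_ptr=pointer))  # caught up: "normal mode"
```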

Managing the Window

The window sent in each segment indicates the range of sequence
numbers the sender of the window (the data receiver) is currently
prepared to accept. There is an assumption that this is related to
the data buffer space currently available for this connection.


Indicating a large window encourages transmissions. If more data
arrives than can be accepted, it will be discarded. This will result
in excessive retransmissions, adding unnecessarily to the load on the
network and the TCPs. Indicating a small window may restrict the
transmission of data to the point of introducing a round trip delay
between each new segment transmitted.

The mechanisms provided allow a TCP to advertise a large window and to
subsequently advertise a much smaller window without having accepted
that much data. This, so called "shrinking the window," is strongly
discouraged. The robustness principle dictates that TCPs will not
shrink the window themselves, but will be prepared for such behavior
on the part of other TCPs.

The sending TCP must be prepared to accept from the user and send at
least one octet of new data even if the send window is zero. The
sending TCP must regularly retransmit to the receiving TCP even when
the window is zero. Two minutes is recommended for the retransmission
interval when the window is zero. This retransmission is essential to
guarantee that when either TCP has a zero window the re-opening of the
window will be reliably reported to the other.

When  the  receiving  TCP  has  a  zero  window  and  a  segment  arrives  it  must 
still  send  an  acknowledgment  showing  its  next  expected  sequence  number 
and  current  window  (zero). 

The sending TCP packages the data to be transmitted into segments
which fit the current window, and may repackage segments on the
retransmission queue. Such repackaging is not required, but may be
helpful.

In a connection with a one-way data flow, the window information will
be carried in acknowledgment segments that all have the same sequence
number so there will be no way to reorder them if they arrive out of
order. This is not a serious problem, but it will allow the window
information to be on occasion temporarily based on old reports from
the data receiver. A refinement to avoid this problem is to act on
the window information from segments that carry the highest
acknowledgment number (that is segments with acknowledgment number
equal or greater than the highest previously received).
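
The refinement just described can be sketched as follows. The
SendWindow class is a hypothetical name, and "equal or greater" is
evaluated modulo 2**32, treating a difference of less than 2**31 as
not older.

```python
MOD = 2 ** 32  # sequence numbers wrap modulo 2**32


class SendWindow:
    """Tracks the peer's advertised window, ignoring reports that are out of date."""

    def __init__(self, snd_una: int, window: int) -> None:
        self.highest_ack = snd_una  # highest acknowledgment number seen so far
        self.window = window        # window taken from the freshest report

    def on_ack(self, ack: int, window: int) -> None:
        # Act on the window only if this ACK is equal to or greater than
        # the highest acknowledgment number previously received.
        if (ack - self.highest_ack) % MOD < 2 ** 31:
            self.highest_ack = ack
            self.window = window


send_window = SendWindow(snd_una=100, window=1000)
send_window.on_ack(ack=300, window=200)
send_window.on_ack(ack=200, window=800)  # delayed older report, ignored
print(send_window.window)
```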

The window management procedure has significant influence on the
communication performance. The following comments are suggestions to
implementers.

Window  Management  Suggestions 

Allocating a very small window causes data to be transmitted in
many small segments when better performance is achieved using
fewer large segments.

One suggestion for avoiding small windows is for the receiver to
defer updating a window until the additional allocation is at
least X percent of the maximum allocation possible for the
connection (where X might be 20 to 40).

Another suggestion is for the sender to avoid sending small
segments by waiting until the window is large enough before
sending data. If the user signals a push function then the
data must be sent even if it is a small segment.

Note that the acknowledgments should not be delayed or unnecessary
retransmissions will result. One strategy would be to send an
acknowledgment when a small segment arrives (without updating the
window information), and then to send another acknowledgment with
new window information when the window is larger.

The  segment  sent  to  probe  a  zero  window  may  also  begin  a  break  up 
of  transmitted  data  into  smaller  and  smaller  segments.  If  a 
segment  containing  a  single  data  octet  sent  to  probe  a  zero  window 
is  accepted,  it  consumes  one  octet  of  the  window  now  available. 

If the sending TCP simply sends as much as it can whenever the
window is non-zero, the transmitted data will be broken into
alternating big and small segments. As time goes on, occasional
pauses in the receiver making window allocation available will
result in breaking the big segments into a small and not quite so
big pair. And after a while the data transmission will be in
mostly small segments.

The suggestion here is that the TCP implementations need to
actively attempt to combine small window allocations into larger
windows, since the mechanisms for managing the window tend to lead
to many small windows in the simplest minded implementations.
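
The receiver-side suggestion above, deferring a window update until
the additional allocation reaches X percent of the maximum, can be
sketched as one function. The function and parameter names are
hypothetical, and the quantities are measured in octets relative to
the next expected sequence number.

```python
def window_to_advertise(current_window: int, free_buffer: int,
                        max_window: int, x_percent: int = 25) -> int:
    """Return the window the receiver should advertise next.

    current_window: what the sender currently believes it may still send
    free_buffer:    buffer space that is actually free now
    max_window:     maximum allocation possible for the connection
    x_percent:      threshold from the text (X might be 20 to 40)
    """
    additional = free_buffer - current_window
    if additional * 100 >= x_percent * max_window:
        return free_buffer      # the increase is worth announcing
    return current_window       # defer; this also never shrinks the window


print(window_to_advertise(current_window=100, free_buffer=200, max_window=1000))
print(window_to_advertise(current_window=100, free_buffer=400, max_window=1000))
```

A small increase leaves the advertised window unchanged, while an
increase of at least a quarter of the maximum is passed on at once.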

3.8.  Interfaces 

There are of course two interfaces of concern: the user/TCP interface
and the TCP/lower-level interface. We have a fairly elaborate model
of the user/TCP interface, but the interface to the lower level
protocol module is left unspecified here, since it will be specified
in detail by the specification of the lower level protocol. For the
case that the lower level is IP we note some of the parameter values
that TCPs might use.

User/TCP  Interface 

The following functional description of user commands to the TCP is,
at best, fictional, since every operating system will have different
facilities. Consequently, we must warn readers that different TCP
implementations may have different user interfaces. However, all
TCPs must provide a certain minimum set of services to guarantee
that all TCP implementations can support the same protocol
hierarchy. This section specifies the functional interfaces
required of all TCP implementations.

TCP  User  Commands 

The following sections functionally characterize a USER/TCP
interface. The notation used is similar to most procedure or
function calls in high level languages, but this usage is not
meant to rule out trap type service calls (e.g., SVCs, UUOs,
EMTs).

The  user  commands  described  below  specify  the  basic  functions  the 
TCP  must  perform  to  support  interprocess  communication. 

Individual implementations must define their own exact format, and
may provide combinations or subsets of the basic functions in
single calls. In particular, some implementations may wish to
automatically OPEN a connection on the first SEND or RECEIVE
issued by the user for a given connection.



In providing interprocess communication facilities, the TCP must
not only accept commands, but must also return information to the
processes it serves. The latter consists of:

(a)  general  information  about  a  connection  (e.g.,  interrupts, 
remote  close,  binding  of  unspecified  foreign  socket). 

(b)  replies  to  specific  user  commands  indicating  success  or 
various  types  of  failure. 

Open 

Format: OPEN (local port, foreign socket, active/passive
[, timeout] [, precedence] [, security/compartment] [, options])
-> local connection name

We assume that the local TCP is aware of the identity of the
processes it serves and will check the authority of the process
to use the connection specified. Depending upon the
implementation of the TCP, the local network and TCP identifiers
for the source address will either be supplied by the TCP or the
lower level protocol (e.g., IP). These considerations are the
result of concern about security, to the extent that no TCP be
able to masquerade as another one, and so on. Similarly, no
process can masquerade as another without the collusion of the
TCP.

If the active/passive flag is set to passive, then this is a
call to LISTEN for an incoming connection. A passive open may
have either a fully specified foreign socket to wait for a
particular connection or an unspecified foreign socket to wait
for any call. A fully specified passive call can be made active
by the subsequent execution of a SEND.

A  transmission  control  block  ( TCB )  is  created  and  partially 

filled  in  with  data  from  the  OPEN  command  parameters.  ^ 

On  an  active  OPEN  command,  the  TCP  will  begin  the  procedure  to 
synchronize  (i.e.,  establish)  the  connection  at  once. 

The timeout, if present, permits the caller to set up a timeout
for all data submitted to TCP. If data is not successfully
delivered to the destination within the timeout period, the TCP
will abort the connection. The present global default is five
minutes.

The TCP or some component of the operating system will verify
the user's authority to open a connection with the specified
precedence or security/compartment. The absence of precedence
or security/compartment specification in the OPEN call indicates
the default values must be used.

TCP will accept incoming requests as matching only if the
security/compartment information is exactly the same and only if
the precedence is equal to or higher than the precedence
requested in the OPEN call.

The precedence for the connection is the higher of the values
requested in the OPEN call and received from the incoming
request, and fixed at that value for the life of the
connection. Implementers may want to give the user control of
this precedence negotiation. For example, the user might be
allowed to specify that the precedence must be exactly matched,
or that any attempt to raise the precedence be confirmed by the
user.
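
The matching rule for incoming requests can be sketched in a few lines.
This is only an illustration under stated assumptions: the function name
negotiate_precedence is hypothetical, precedence is modelled as an
integer (higher means more urgent), and security/compartment is treated
as an opaque string.

```python
def negotiate_precedence(open_prec, open_sec, seg_prec, seg_sec):
    """Return the connection precedence, or None if the request does not match.

    Hypothetical helper: the names and value types are assumptions made
    for this sketch, not part of the specification.
    """
    if seg_sec != open_sec:
        return None                    # security/compartment must be exactly the same
    if seg_prec < open_prec:
        return None                    # precedence must be equal to or higher
    return max(open_prec, seg_prec)    # fixed for the life of the connection
```

An implementation that lets the user demand an exact precedence match
would replace the second test with an equality check.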

A local connection name will be returned to the user by the TCP.
The local connection name can then be used as a shorthand term
for the connection defined by the <local socket, foreign socket>
pair.

Send 

Format: SEND (local connection name, buffer address, byte
count, PUSH flag, URGENT flag [, timeout])

This call causes the data contained in the indicated user buffer
to be sent on the indicated connection. If the connection has
not been opened, the SEND is considered an error. Some
implementations may allow users to SEND first; in which case, an
automatic OPEN would be done. If the calling process is not
authorized to use this connection, an error is returned.

If the PUSH flag is set, the data must be transmitted promptly
to the receiver, and the PUSH bit will be set in the last TCP
segment created from the buffer. If the PUSH flag is not set,
the data may be combined with data from subsequent SENDs for
transmission efficiency.
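
As a rough illustration of where the PUSH bit ends up, the sketch below
cuts a user buffer into segments and marks only the last one. The
function name segmentize and the max_payload parameter are assumptions
made for the example; the text above does not say how segment sizes are
chosen.

```python
def segmentize(data: bytes, max_payload: int, push: bool):
    """Split a SEND buffer into (payload, push_bit) pairs.

    Hypothetical sketch: max_payload stands in for whatever limit the
    implementation applies to a single segment.
    """
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)]
    last = len(chunks) - 1
    # The PUSH bit is carried only by the last segment created from the buffer.
    return [(chunk, push and index == last) for index, chunk in enumerate(chunks)]
```

With push=False every segment carries a clear PUSH bit, which leaves the
TCP free to combine the tail of this buffer with later SENDs.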

If the URGENT flag is set, segments sent to the destination TCP
will have the urgent pointer set. The receiving TCP will signal
the urgent condition to the receiving process if the urgent
pointer indicates that data preceding the urgent pointer has not
been consumed by the receiving process. The purpose of urgent
is to stimulate the receiver to process the urgent data and to
indicate to the receiver when all the currently known urgent
data has been received. The number of times the sending user's
TCP signals urgent will not necessarily be equal to the number
of times the receiving user will be notified of the presence of
urgent data.

If  no  foreign  socket  was  specified  in  the  OPEN,  but  the 
connection  is  established  (e.g.,  because  a  LISTENing  connection 
has  become  specific  due  to  a  foreign  segment  arriving  for  the 
local  socket),  then  the  designated  buffer  is  sent  to  the  implied 
foreign  socket.  Users  who  make  use  of  OPEN  with  an  unspecified 
foreign  socket  can  make  use  of  SEND  without  ever  explicitly 
knowing  the  foreign  socket  address. 

However,  if  a  SEND  is  attempted  before  the  foreign  socket 
becomes  specified,  an  error  will  be  returned.  Users  can  use  the 
STATUS  call  to  determine  the  status  of  the  connection.  In  some 
implementations  the  TCP  may  notify  the  user  when  an  unspecified 
socket  is  bound. 

If  a  timeout  is  specified,  the  current  user  timeout  for  this 
connection  is  changed  to  the  new  one. 

In the simplest implementation, SEND would not return control to
the sending process until either the transmission was complete
or the timeout had been exceeded. However, this simple method
is both subject to deadlocks (for example, both sides of the
connection might try to do SENDs before doing any RECEIVEs) and
offers poor performance, so it is not recommended. A more
sophisticated implementation would return immediately to allow
the process to run concurrently with network I/O, and,
furthermore, to allow multiple SENDs to be in progress.

Multiple  SENDs  are  served  in  first  come,  first  served  order,  so 
the  TCP  will  queue  those  it  cannot  service  immediately. 

We have implicitly assumed an asynchronous user interface in
which a SEND later elicits some kind of SIGNAL or
pseudo-interrupt from the serving TCP. An alternative is to
return a response immediately. For instance, SENDs might return
immediate local acknowledgment, even if the segment sent had not
been acknowledged by the distant TCP. We could optimistically
assume eventual success. If we are wrong, the connection will
close anyway due to the timeout. In implementations of this
kind (synchronous), there will still be some asynchronous
signals, but these will deal with the connection itself, and not
with specific segments or buffers.

In order for the process to distinguish among error or success
indications for different SENDs, it might be appropriate for the
buffer address to be returned along with the coded response to
the SEND request. TCP-to-user signals are discussed below,
indicating the information which should be returned to the
calling process.

Receive 

Format:  RECEIVE  (local  connection  name,  buffer  address,  byte 
count)  ->  byte  count,  urgent  flag,  push  flag 

This command allocates a receiving buffer associated with the
specified connection. If no OPEN precedes this command or the
calling process is not authorized to use this connection, an
error is returned.

In the simplest implementation, control would not return to the
calling program until either the buffer was filled, or some
error occurred, but this scheme is highly subject to deadlocks.
A more sophisticated implementation would permit several
RECEIVEs to be outstanding at once. These would be filled as
segments arrive. This strategy permits increased throughput at
the cost of a more elaborate scheme (possibly asynchronous) to
notify the calling program that a PUSH has been seen or a buffer
filled.

If enough data arrive to fill the buffer before a PUSH is seen,
the PUSH flag will not be set in the response to the RECEIVE.
The buffer will be filled with as much data as it can hold. If
a PUSH is seen before the buffer is filled the buffer will be
returned partially filled and PUSH indicated.
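
The interaction between buffer size and PUSH can be illustrated with a
small model. It is a sketch under assumptions: the function name receive
is hypothetical, queued in-order data is represented as a deque of
(payload, push_bit) pairs, and the call returns whatever is on hand
instead of waiting for more segments as a real TCP would.

```python
from collections import deque

def receive(queue: deque, byte_count: int):
    """Fill a buffer of byte_count bytes; return (data, push_flag).

    Hypothetical model, not the specification's interface.
    """
    out = bytearray()
    while queue and len(out) < byte_count:
        payload, push = queue.popleft()
        room = byte_count - len(out)
        out += payload[:room]
        if len(payload) > room:
            # Buffer filled before the PUSH was reached: keep the remainder
            # (and its PUSH bit) for the next RECEIVE.
            queue.appendleft((payload[room:], push))
            return bytes(out), False
        if push:
            # PUSH seen: return the buffer even if it is only partially filled.
            return bytes(out), True
    return bytes(out), False
```

Calling receive repeatedly on the same queue shows both cases: a short
return with PUSH indicated, and a full buffer with PUSH not set.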

If there is urgent data the user will have been informed as soon
as it arrived via a TCP-to-user signal. The receiving user
should thus be in "urgent mode". If the URGENT flag is on,
additional urgent data remains. If the URGENT flag is off, this
call to RECEIVE has returned all the urgent data, and the user
may now leave "urgent mode". Note that data following the
urgent pointer (non-urgent data) cannot be delivered to the user
in the same buffer with preceding urgent data unless the
boundary is clearly marked for the user.

To distinguish among several outstanding RECEIVEs and to take
care of the case that a buffer is not completely filled, the
return code is accompanied by both a buffer pointer and a byte
count indicating the actual length of the data received.

Alternative implementations of RECEIVE might have the TCP
allocate buffer storage, or the TCP might share a ring buffer
with the user.

Close 

Format:  CLOSE  (local  connection  name) 

This  command  causes  the  connection  specified  to  be  closed.  If 
the  connection  is  not  open  or  the  calling  process  is  not 
authorized  to  use  this  connection,  an  error  is  returned. 

Closing  connections  is  intended  to  be  a  graceful  operation  in 
the  sense  that  outstanding  SENDs  will  be  transmitted  (and 
retransmitted),  as  flow  control  permits,  until  all  have  been 
serviced.  Thus,  it  should  be  acceptable  to  make  several  SEND 
calls,  followed  by  a  CLOSE,  and  expect  all  the  data  to  be  sent 
to the destination. It should also be clear that users should
continue to RECEIVE on CLOSING connections, since the other side
may be trying to transmit the last of its data. Thus, CLOSE
means  "I  have  no  more  to  send"  but  does  not  mean  "I  will  not 
receive  any  more."  It  may  happen  (if  the  user  level  protocol  is 
not  well  thought  out)  that  the  closing  side  is  unable  to  get  rid 
of  all  its  data  before  timing  out.  In  this  event,  CLOSE  turns 
into  ABORT,  and  the  closing  TCP  gives  up. 

The  user  may  CLOSE  the  connection  at  any  time  on  his  own 
initiative,  or  in  response  to  various  prompts  from  the  TCP 
(e.g.,  remote  close  executed,  transmission  timeout  exceeded, 
destination  inaccessible). 

Because closing a connection requires communication with the
foreign TCP, connections may remain in the closing state for a
short time. Attempts to reopen the connection before the TCP
replies to the CLOSE command will result in error responses.

Close  also  implies  push  function. 

Status 

Format:  STATUS  (local  connection  name)  ->  status  data 

This  is  an  implementation  dependent  user  command  and  could  be 
excluded  without  adverse  effect.  Information  returned  would 
typically  come  from  the  TCB  associated  with  the  connection. 

This command returns a data block containing the following
information:

local socket,
foreign socket,
local connection name,
receive window,
send window,
connection state,
number of buffers awaiting acknowledgment,
number of buffers pending receipt,
urgent state,
precedence,
security/compartment,
and transmission timeout.

Depending  on  the  state  of  the  connection,  or  on  the 
implementation  itself,  some  of  this  information  may  not  be 
available  or  meaningful.  If  the  calling  process  is  not 
authorized  to  use  this  connection,  an  error  is  returned.  This 
prevents  unauthorized  processes  from  gaining  information  about  a 
connection. 


Abort 

Format:  ABORT  (local  connection  name) 

This command causes all pending SENDs and RECEIVEs to be
aborted, the TCB to be removed, and a special RESET message to
be sent to the TCP on the other side of the connection.

Depending  on  the  implementation,  users  may  receive  abort 
indications  for  each  outstanding  SEND  or  RECEIVE,  or  may  simply 
receive  an  ABORT-acknowledgment. 

TCP-to-User  Messages 

It  is  assumed  that  the  operating  system  environment  provides  a 
means  for  the  TCP  to  asynchronously  signal  the  user  program.  When 
the  TCP  does  signal  a  user  program,  certain  information  is  passed 
to  the  user.  Often  in  the  specification  the  information  will  be 
an  error  message.  In  other  cases  there  will  be  information 
relating to the completion of processing a SEND or RECEIVE or
other user call.


The  following  information  is  provided: 

Local Connection Name               Always
Response String                     Always
Buffer Address                      Send & Receive
Byte count (counts bytes received)  Receive
Push flag                           Receive
Urgent flag                         Receive

TCP/Lower-Level Interface

The TCP calls on a lower level protocol module to actually send and
receive information over a network. One case is that of the ARPA
internetwork system where the lower level module is the Internet
Protocol (IP) [2].

If  the  lower  level  protocol  is  IP  it  provides  arguments  for  a  type 
of  service  and  for  a  time  to  live.  TCP  uses  the  following  settings 
for  these  parameters: 

Type of Service = Precedence: routine, Delay: normal, Throughput:
normal, Reliability: normal; or 00000000.

Time to Live = one minute, or 00111100.

Note  that  the  assumed  maximum  segment  lifetime  is  two  minutes. 
Here  we  explicitly  ask  that  a  segment  be  destroyed  if  it  cannot 
be  delivered  by  the  internet  system  within  one  minute. 
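
The two bit patterns quoted above can be checked with a short
calculation. The sketch assumes the IP type-of-service octet layout of a
3-bit precedence field followed by one-bit delay, throughput, and
reliability flags and two unused bits, and a time to live counted in
seconds; the function name type_of_service is hypothetical.

```python
def type_of_service(precedence=0, low_delay=False,
                    high_throughput=False, high_reliability=False):
    """Pack the type-of-service octet (assumed layout: PPP D T R 0 0)."""
    return ((precedence & 0b111) << 5
            | int(low_delay) << 4
            | int(high_throughput) << 3
            | int(high_reliability) << 2)

TIME_TO_LIVE_SECONDS = 60   # one minute

tos_bits = format(type_of_service(), "08b")      # routine precedence, all normal
ttl_bits = format(TIME_TO_LIVE_SECONDS, "08b")   # sixty in binary
```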

If  the  lower  level  is  IP  (or  other  protocol  that  provides  this 
feature)  and  source  routing  is  used,  the  interface  must  allow  the 
route  information  to  be  communicated.  This  is  especially  important 
so  that  the  source  and  destination  addresses  used  in  the  TCP 
checksum  be  the  originating  source  and  ultimate  destination.  It  is 
also important to preserve the return route to answer connection
requests.

Any lower level protocol will have to provide the source address,
destination address, and protocol fields, and some way to determine
the "TCP length", both to provide the functional equivalent service
of IP and to be used in the TCP checksum.

3.9. Event Processing

The  processing  depicted  in  this  section  is  an  example  of  one  possible 
implementation.  Other  implementations  may  have  slightly  different 
processing  sequences,  but  they  should  differ  from  those  in  this 
section  only  in  detail,  not  in  substance. 

The  activity  of  the  TCP  can  be  characterized  as  responding  to  events. 
The events that occur can be cast into three categories: user calls,
arriving segments, and timeouts. This section describes the
processing  the  TCP  does  in  response  to  each  of  the  events.  In  many 
cases  the  processing  required  depends  on  the  state  of  the  connection. 

Events that occur:

User Calls

OPEN
SEND
RECEIVE
CLOSE
ABORT
STATUS

Arriving Segments

SEGMENT ARRIVES

Timeouts

USER TIMEOUT
RETRANSMISSION TIMEOUT
TIME-WAIT TIMEOUT

The  model  of  the  TCP/user  interface  is  that  user  commands  receive  an 
immediate  return  and  possibly  a  delayed  response  via  an  event  or 
pseudo  interrupt.  In  the  following  descriptions,  the  term  "signal" 
means  cause  a  delayed  response. 

Error  responses  are  given  as  character  strings.  For  example,  user 
commands  referencing  connections  that  do  not  exist  receive  "error: 
connection  not  open". 

Please note in the following that all arithmetic on sequence numbers,
acknowledgment numbers, windows, et cetera, is modulo 2**32, the size
of the sequence number space. Also note that "=<" means less than or
equal to (modulo 2**32).

A natural way to think about processing incoming segments is to
imagine that they are first tested for proper sequence number (i.e.,
that their contents lie in the range of the expected "receive window"
in the sequence number space) and then that they are generally queued
and processed in sequence number order.

When a segment overlaps other already received segments we reconstruct
the segment to contain just the new data, and adjust the header fields
to be consistent.


Note  that  if  no  state  change  is  mentioned  the  TCP  stays  in  the  same 
state. 
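
The modulo 2**32 comparisons and the trimming of overlapping segments
described in this section can be illustrated as follows. This is a
sketch, not a prescribed method: the helper names are hypothetical, and
treating two numbers as ordered only when they lie less than 2**31 apart
is a common convention assumed here, not something the text states.

```python
MOD = 2**32

def seq_leq(a, b):
    """a =< b in sequence space (assumes a and b are less than 2**31 apart)."""
    return (b - a) % MOD < 2**31

def in_window(seq, start, size):
    """True if start =< seq < start+size, with wraparound."""
    return (seq - start) % MOD < size

def trim_to_new(seg_seq, data, rcv_nxt):
    """Drop the part of a segment that lies before rcv_nxt.

    Returns the adjusted (sequence number, data) pair.
    """
    already = (rcv_nxt - seg_seq) % MOD
    if already < 2**31:                      # segment starts at or before rcv_nxt
        skipped = min(already, len(data))
        return (seg_seq + skipped) % MOD, data[skipped:]
    return seg_seq, data                     # segment starts beyond rcv_nxt
```

The wraparound cases are the reason plain integer comparison is not
enough: 0xFFFFFFF0 precedes 5 in sequence space even though it is
numerically larger.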

OPEN  Call 

CLOSED STATE (i.e., TCB does not exist)

Create a new transmission control block (TCB) to hold connection
state information. Fill in local socket identifier, foreign
socket, precedence, security/compartment, and user timeout
information. Note that some parts of the foreign socket may be
unspecified in a passive OPEN and are to be filled in by the
parameters of the incoming SYN segment. Verify the security and
precedence requested are allowed for this user, if not return
"error: precedence not allowed" or "error: security/compartment
not allowed." If passive enter the LISTEN state and return. If
active and the foreign socket is unspecified, return "error:
foreign socket unspecified"; if active and the foreign socket is
specified, issue a SYN segment. An initial send sequence number
(ISS) is selected. A SYN segment of the form <SEQ=ISS><CTL=SYN>
is sent. Set SND.UNA to ISS, SND.NXT to ISS+1, enter SYN-SENT
state, and return.

If the caller does not have access to the local socket specified,
return "error: connection illegal for this process". If there is
no room to create a new connection, return "error: insufficient
resources".
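
The CLOSED-state handling of OPEN can be sketched as below. The sketch
makes several assumptions that are not in the text: the TCB is a small
dataclass with hypothetical field names, the ISS comes from a simple
counter, outgoing segments are collected in a list instead of being
transmitted, and the access, precedence, security, and resource checks
are omitted.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_next_iss = count(1000)   # placeholder ISS source, for illustration only

@dataclass
class TCB:
    local_port: int
    foreign_socket: Optional[tuple]
    state: str = "CLOSED"
    snd_una: int = 0
    snd_nxt: int = 0
    outgoing: list = field(default_factory=list)

def open_call(local_port, foreign_socket, active):
    """Return a TCB on success or an error string."""
    if active and foreign_socket is None:
        return "error: foreign socket unspecified"
    tcb = TCB(local_port, foreign_socket)
    if not active:
        tcb.state = "LISTEN"
        return tcb
    iss = next(_next_iss)
    tcb.outgoing.append({"SEQ": iss, "CTL": {"SYN"}})   # <SEQ=ISS><CTL=SYN>
    tcb.snd_una, tcb.snd_nxt = iss, (iss + 1) % 2**32
    tcb.state = "SYN-SENT"
    return tcb
```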

LISTEN  STATE 

If  active  and  the  foreign  socket  is  specified,  then  change  the 
connection  from  passive  to  active,  select  an  ISS.  Send  a  SYN 
segment,  set  SND.UNA  to  ISS,  SND.NXT  to  ISS+1.  Enter  SYN-SENT 
state.  Data  associated  with  SEND  may  be  sent  with  SYN  segment  or 
queued  for  transmission  after  entering  ESTABLISHED  state.  The 
urgent  bit  if  requested  in  the  command  must  be  sent  with  the  data 
segments  sent  as  a  result  of  this  command.  If  there  is  no  room  to 
queue  the  request,  respond  with  "error:  insufficient  resources". 
If the foreign socket was not specified, then return "error: foreign
socket unspecified".

SYN-SENT STATE
SYN-RECEIVED STATE
ESTABLISHED  STATE 
FIN-WAIT-1  STATE 
FIN-WAIT-2  STATE 
CLOSE-WAIT  STATE 
CLOSING  STATE 
LAST-ACK  STATE 
TIME-WAIT  STATE 


Return "error: connection already exists".

SEND  Call 

CLOSED  STATE  (i.e.,  TCB  does  not  exist) 

If the user does not have access to such a connection, then return
"error: connection illegal for this process".

Otherwise,  return  "error:  connection  does  not  exist". 

LISTEN  STATE 

If the foreign socket is specified, then change the connection
from passive to active, select an ISS. Send a SYN segment, set
SND.UNA to ISS, SND.NXT to ISS+1. Enter SYN-SENT state. Data
associated with SEND may be sent with SYN segment or queued for
transmission after entering ESTABLISHED state. The urgent bit if
requested in the command must be sent with the data segments sent
as a result of this command. If there is no room to queue the
request, respond with "error: insufficient resources". If the
foreign socket was not specified, then return "error: foreign
socket unspecified".

SYN-SENT  STATE 
SYN-RECEIVED  STATE 

Queue  the  data  for  transmission  after  entering  ESTABLISHED  state. 
If no space to queue, respond with "error: insufficient
resources".

ESTABLISHED  STATE 
CLOSE-WAIT  STATE 

Segmentize the buffer and send it with a piggybacked
acknowledgment (acknowledgment value = RCV.NXT). If there is
insufficient space to remember this buffer, simply return "error:
insufficient resources".

If  the  urgent  flag  is  set,  then  SND.UP  <-  SND.NXT-1  and  set  the 
urgent  pointer  in  the  outgoing  segments. 



FIN-WAIT-1  STATE 
FIN-WAIT-2  STATE 
CLOSING  STATE 
LAST-ACK  STATE 
TIME-WAIT  STATE 

Return  "error:  connection  closing"  and  do  not  service  request. 
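
The per-state handling of the SEND call can be summarized as a small
dispatch function. It is a sketch only: the function name send_call, its
parameters, and the descriptive result strings for the success cases are
assumptions, and the check that the caller is authorized to use the
connection is left out.

```python
CLOSING_STATES = {"FIN-WAIT-1", "FIN-WAIT-2", "CLOSING", "LAST-ACK", "TIME-WAIT"}

def send_call(state, foreign_specified=True, room=True):
    """Return (next_state, result) for a SEND issued in the given state."""
    if state == "CLOSED":
        return state, "error: connection does not exist"
    if state in CLOSING_STATES:
        return state, "error: connection closing"
    if state == "LISTEN" and not foreign_specified:
        return state, "error: foreign socket unspecified"
    if not room:
        return state, "error: insufficient resources"
    if state == "LISTEN":
        # Passive becomes active: select ISS, send SYN, queue or attach the data.
        return "SYN-SENT", "SYN sent, data queued"
    if state in ("SYN-SENT", "SYN-RECEIVED"):
        return state, "data queued"
    if state in ("ESTABLISHED", "CLOSE-WAIT"):
        return state, "data segmentized and sent"
    raise ValueError("unknown state: " + state)
```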



RECEIVE  Call 


CLOSED STATE (i.e., TCB does not exist)

If the user does not have access to such a connection, return
"error: connection illegal for this process".

Otherwise return "error: connection does not exist".

LISTEN  STATE 
SYN-SENT  STATE 
SYN-RECEIVED  STATE 


Queue for processing after entering ESTABLISHED state. If there
is no room to queue this request, respond with "error:
insufficient resources".

ESTABLISHED  STATE 
FIN-WAIT-1  STATE 
FIN-WAIT-2  STATE 


If insufficient incoming segments are queued to satisfy the
request, queue the request. If there is no queue space to
remember the RECEIVE, respond with "error: insufficient
resources".

Reassemble queued incoming segments into receive buffer and return
to user. Mark "push seen" if this is the case.

If RCV.UP is in advance of the data currently being passed to the
user notify the user of the presence of urgent data.

When the TCP takes responsibility for delivering data to the user
that fact must be communicated to the sender via an
acknowledgment. The formation of such an acknowledgment is
described below in the discussion of processing an incoming
segment.

CLOSE-WAIT  STATE 

Since the remote side has already sent FIN, RECEIVEs must be
satisfied by text already on hand, but not yet delivered to the
user. If no text is awaiting delivery, the RECEIVE will get a
"error: connection closing" response. Otherwise, any remaining
text can be used to satisfy the RECEIVE.

CLOSING  STATE 
LAST-ACK  STATE 
TIME-WAIT  STATE 

Return "error: connection closing".

CLOSE  Call 

CLOSED  STATE  (i.e.,  TCB  does  not  exist) 

If  the  user  does  not  have  access  to  such  a  connection,  return 
"error:  connection  illegal  for  this  process". 

Otherwise,  return  "error:  connection  does  not  exist". 

LISTEN  STATE 

Any outstanding RECEIVEs are returned with "error: closing"
responses. Delete TCB, enter CLOSED state, and return.

SYN-SENT  STATE 

Delete the TCB and return "error: closing" responses to any
queued SENDs or RECEIVEs.

SYN-RECEIVED  STATE 

If no SENDs have been issued and there is no pending data to send,
then form a FIN segment and send it, and enter FIN-WAIT-1 state;
otherwise queue for processing after entering ESTABLISHED state.

ESTABLISHED  STATE 

Queue this until all preceding SENDs have been segmentized, then
form a FIN segment and send it. In any case, enter FIN-WAIT-1
state.

FIN-WAIT-1  STATE 
FIN-WAIT-2  STATE 

Strictly speaking, this is an error and should receive an "error:
connection closing" response. An "ok" response would be
acceptable, too, as long as a second FIN is not emitted (the first
FIN may be retransmitted though).

CLOSE-WAIT  STATE 

Queue this request until all preceding SENDs have been
segmentized; then send a FIN segment, enter LAST-ACK state.

CLOSING  STATE 
LAST-ACK  STATE 
TIME-WAIT  STATE 

Respond with "error: connection closing".
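
The state transitions caused by a CLOSE call can be collected into one
function. This is a sketch with hypothetical names and descriptive
result strings. One choice needs flagging: a CLOSE issued in CLOSE-WAIT
is mapped to LAST-ACK here, which is the transition that RFC 1122
specifies for this case; treat that as an assumption of the sketch.

```python
def close_call(state, pending_data=False):
    """Return (next_state, result) for a CLOSE issued in the given state."""
    if state == "CLOSED":
        return state, "error: connection does not exist"
    if state in ("LISTEN", "SYN-SENT"):
        # Queued SENDs/RECEIVEs get "error: closing"; the TCB is deleted.
        return "CLOSED", "TCB deleted"
    if state == "SYN-RECEIVED":
        if pending_data:
            return state, "CLOSE queued until ESTABLISHED"
        return "FIN-WAIT-1", "FIN sent"
    if state == "ESTABLISHED":
        return "FIN-WAIT-1", "FIN queued behind preceding SENDs"
    if state == "CLOSE-WAIT":
        return "LAST-ACK", "FIN queued behind preceding SENDs"   # assumption, see lead-in
    # FIN-WAIT-1, FIN-WAIT-2, CLOSING, LAST-ACK, TIME-WAIT
    return state, "error: connection closing"
```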

ABORT  Call 

CLOSED STATE (i.e., TCB does not exist)

If the user should not have access to such a connection, return
"error: connection illegal for this process".

Otherwise return "error: connection does not exist".

LISTEN  STATE 

Any  outstanding  RECEIVES  should  be  returned  with  "error: 
connection  reset"  responses.  Delete  TCB,  enter  CLOSED  state,  and 
return. 

SYN-SENT  STATE 

All queued SENDs and RECEIVEs should be given "connection reset"
notification, delete the TCB, enter CLOSED state, and return.

SYN-RECEIVED  STATE 
ESTABLISHED  STATE 
FIN-WAIT-1  STATE 
FIN-WAIT-2  STATE 
CLOSE-WAIT STATE

Send a reset segment:

<SEQ=SND.NXT><CTL=RST>

All queued SENDs and RECEIVEs should be given "connection reset"
notification; all segments queued for transmission (except for the
RST formed above) or retransmission should be flushed, delete the
TCB, enter CLOSED state, and return.

CLOSING  STATE 
LAST-ACK  STATE 
TIME-WAIT  STATE 

Respond with "ok" and delete the TCB, enter CLOSED state, and
return.


STATUS  Call 

CLOSED  STATE  (i.e.,  TCB  does  not  exist) 

If the user should not have access to such a connection, return
"error: connection illegal for this process".

Otherwise return "error: connection does not exist".


LISTEN STATE

Return "state = LISTEN", and the TCB pointer.

SYN-SENT STATE

Return "state = SYN-SENT", and the TCB pointer.

SYN-RECEIVED STATE

Return "state = SYN-RECEIVED", and the TCB pointer.

ESTABLISHED STATE

Return "state = ESTABLISHED", and the TCB pointer.

FIN-WAIT-1 STATE

Return "state = FIN-WAIT-1", and the TCB pointer.

FIN-WAIT-2 STATE

Return "state = FIN-WAIT-2", and the TCB pointer.

CLOSE-WAIT STATE

Return "state = CLOSE-WAIT", and the TCB pointer.

CLOSING STATE

Return "state = CLOSING", and the TCB pointer.

LAST-ACK STATE

Return "state = LAST-ACK", and the TCB pointer.

TIME-WAIT STATE

Return "state = TIME-WAIT", and the TCB pointer.

SEGMENT  ARRIVES 

If the state is CLOSED (i.e., TCB does not exist) then

all data in the incoming segment is discarded. An incoming
segment containing a RST is discarded. An incoming segment not
containing a RST causes a RST to be sent in response. The
acknowledgment and sequence field values are selected to make the
reset sequence acceptable to the TCP that sent the offending
segment.

If the ACK bit is off, sequence number zero is used,

<SEQ=0><ACK=SEG.SEQ+SEG.LEN><CTL=RST,ACK>

If the ACK bit is on,

<SEQ=SEG.ACK><CTL=RST>

Return.
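
The choice of reset for a segment arriving in the CLOSED state can be
written out as a small function. The representation is an assumption
made for the sketch: segments are dictionaries with "SEQ", "ACK", "LEN",
and a "CTL" set of control-bit names, and the caller is expected to have
computed LEN already.

```python
MOD = 2**32

def reset_for_closed(seg):
    """Return the RST to send for a segment arriving in CLOSED, or None."""
    if "RST" in seg["CTL"]:
        return None                                   # a reset is simply discarded
    if "ACK" in seg["CTL"]:
        # <SEQ=SEG.ACK><CTL=RST>
        return {"SEQ": seg["ACK"], "CTL": {"RST"}}
    # <SEQ=0><ACK=SEG.SEQ+SEG.LEN><CTL=RST,ACK>
    return {"SEQ": 0,
            "ACK": (seg["SEQ"] + seg["LEN"]) % MOD,
            "CTL": {"RST", "ACK"}}
```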

If the state is LISTEN then

first check for an RST

An incoming RST should be ignored. Return.

second check for an ACK

Any acknowledgment is bad if it arrives on a connection still in
the LISTEN state. An acceptable reset segment should be formed
for any arriving ACK-bearing segment. The RST should be
formatted as follows:

<SEQ=SEG.ACK><CTL=RST>

Return.

third check for a SYN

If the SYN bit is set, check the security. If the
security/compartment on the incoming segment does not exactly
match the security/compartment in the TCB then send a reset and
return.

<SEQ=SEG.ACK><CTL=RST>

If the SEG.PRC is greater than the TCB.PRC then if allowed by
the user and the system set TCB.PRC<-SEG.PRC, if not allowed
send a reset and return.

<SEQ=SEG.ACK><CTL=RST>

If  the  SEG.PRC  is  less  than  the  TCB.PRC  then  continue. 

Set RCV.NXT to SEG.SEQ+1, IRS is set to SEG.SEQ and any other
control or text should be queued for processing later. ISS
should be selected and a SYN segment sent of the form:

<SEQ=ISS><ACK=RCV.NXT><CTL=SYN,ACK>

SND.NXT is set to ISS+1 and SND.UNA to ISS. The connection
state should be changed to SYN-RECEIVED. Note that any other
incoming control or data (combined with SYN) will be processed
in the SYN-RECEIVED state, but processing of SYN and ACK should
not be repeated. If the listen was not fully specified (i.e.,
the foreign socket was not fully specified), then the
unspecified fields should be filled in now.
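
The bookkeeping for an acceptable SYN arriving in the LISTEN state can be
sketched as follows. It assumes the security and precedence checks have
already passed, models the TCB and segments as dictionaries with
hypothetical keys ("foreign", "source"), and takes the ISS as a
parameter because the selection method is outside this passage.

```python
MOD = 2**32

def accept_syn_in_listen(tcb, seg, iss):
    """Update the TCB for an accepted SYN and return the SYN,ACK to send."""
    tcb["IRS"] = seg["SEQ"]
    tcb["RCV.NXT"] = (seg["SEQ"] + 1) % MOD
    tcb["ISS"] = iss
    tcb["SND.UNA"] = iss
    tcb["SND.NXT"] = (iss + 1) % MOD
    tcb["state"] = "SYN-RECEIVED"
    if tcb.get("foreign") is None:
        tcb["foreign"] = seg["source"]     # fill in the unspecified foreign socket
    # <SEQ=ISS><ACK=RCV.NXT><CTL=SYN,ACK>
    return {"SEQ": iss, "ACK": tcb["RCV.NXT"], "CTL": {"SYN", "ACK"}}
```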

fourth  other  text  or  control 

Any other control or text-bearing segment (not containing SYN)
must have an ACK and thus would be discarded by the ACK
processing.  An  incoming  RST  segment  could  not  be  valid,  since 
it  could  not  have  been  sent  in  response  to  anything  sent  by  this 
incarnation  of  the  connection.  So  you  are  unlikely  to  get  here, 
but  if  you  do,  drop  the  segment,  and  return. 

If  the  state  is  SYN-SENT  then 

first  check  the  ACK  bit 

If  the  ACK  bit  is  set 

If SEG.ACK =< ISS, or SEG.ACK > SND.NXT, send a reset (unless
the RST bit is set, if so drop the segment and return)

<SEQ=SEG.ACK><CTL=RST>

and discard the segment. Return.

If SND.UNA =< SEG.ACK =< SND.NXT then the ACK is acceptable.
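
The ACK test for the SYN-SENT state can be expressed with modular
comparisons. The sketch below uses hypothetical names and returns a
short verdict string instead of sending anything; the rule that two
sequence numbers are ordered only when less than 2**31 apart is an
assumption of the example.

```python
MOD = 2**32

def seq_leq(a, b):
    """a =< b in sequence space (assumes a and b are less than 2**31 apart)."""
    return (b - a) % MOD < 2**31

def syn_sent_ack_check(seg_ack, has_rst, iss, snd_una, snd_nxt):
    """Classify an ACK-bearing segment: 'reset', 'drop', or 'acceptable'."""
    if seq_leq(seg_ack, iss) or not seq_leq(seg_ack, snd_nxt):
        # SEG.ACK =< ISS or SEG.ACK > SND.NXT
        return "drop" if has_rst else "reset"
    if seq_leq(snd_una, seg_ack) and seq_leq(seg_ack, snd_nxt):
        return "acceptable"
    return "drop"
```

With ISS = 100 and one SYN outstanding (SND.NXT = 101), only an
acknowledgment number of 101 is acceptable.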


second check the RST bit

If the RST bit is set

If the ACK was acceptable then signal the user "error:
connection reset", drop the segment, enter CLOSED state,
delete TCB, and return. Otherwise (no ACK) drop the segment
and return.

third check the security and precedence

If the security/compartment in the segment does not exactly
match the security/compartment in the TCB, send a reset

If there is an ACK

<SEQ=SEG.ACK><CTL=RST>

Otherwise

<SEQ=0><ACK=SEG.SEQ+SEG.LEN><CTL=RST,ACK>

If there is an ACK

The precedence in the segment must match the precedence in the
TCB, if not, send a reset

<SEQ=SEG.ACK><CTL=RST>

If there is no ACK

If the precedence in the segment is higher than the precedence
in the TCB then if allowed by the user and the system raise
the precedence in the TCB to that in the segment, if not
allowed to raise the prec then send a reset.

<SEQ=0><ACK=SEG.SEQ+SEG.LEN><CTL=RST,ACK>

If the precedence in the segment is lower than the precedence
in the TCB continue.

If a reset was sent, discard the segment and return.

fourth check the SYN bit

This  step  should  be  reached  only  if  the  ACK  is  ok,  or  there  is 
no  ACK,  and  it  the  segment  did  not  contain  a  RST, 

If the SYN bit is on and the security/compartment and precedence
are acceptable then, RCV.NXT is set to SEG.SEQ+1, IRS is set to
SEG.SEQ. SND.UNA should be advanced to equal SEG.ACK (if there
is an ACK), and any segments on the retransmission queue which
are thereby acknowledged should be removed.

If SND.UNA > ISS (our SYN has been ACKed), change the connection
state to ESTABLISHED, form an ACK segment

<SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>

and  send  it.  Data  or  controls  which  were  queued  for 
transmission  may  be  included.  If  there  are  other  controls  or 
text  in  the  segment  then  continue  processing  at  the  sixth  step 
below  where  the  URG  bit  is  checked,  otherwise  return. 

Otherwise enter SYN-RECEIVED, form a SYN,ACK segment

<SEQ=ISS><ACK=RCV.NXT><CTL=SYN,ACK>

and  send  it.  If  there  are  other  controls  or  text  in  the 
segment,  queue  them  for  processing  after  the  ESTABLISHED  state 
has  been  reached,  return. 
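
The fourth step for SYN-SENT can be sketched as below. This is only
an illustration under stated assumptions: the TCB is modelled as a
plain dictionary, the function name syn_sent_on_syn and the reply
dictionary layout are hypothetical, retransmission-queue cleanup is
omitted, and "SND.UNA > ISS" is evaluated modulo 2**32.

```python
MOD = 2 ** 32  # assumed size of the circular sequence space

def seq_gt(a, b):
    """a > b in circular sequence space."""
    return a != b and (a - b) % MOD < MOD // 2

def syn_sent_on_syn(tcb, seg_seq, seg_ack=None):
    """Handle an acceptable SYN in SYN-SENT; return the segment to send."""
    tcb["RCV.NXT"] = (seg_seq + 1) % MOD
    tcb["IRS"] = seg_seq
    if seg_ack is not None:              # only if the segment carried an ACK
        tcb["SND.UNA"] = seg_ack
    if seq_gt(tcb["SND.UNA"], tcb["ISS"]):
        # Our SYN has been ACKed: the connection is established.
        tcb["STATE"] = "ESTABLISHED"
        return {"SEQ": tcb["SND.NXT"], "ACK": tcb["RCV.NXT"], "CTL": ("ACK",)}
    # Otherwise both sides sent SYNs that crossed: answer with SYN,ACK.
    tcb["STATE"] = "SYN-RECEIVED"
    return {"SEQ": tcb["ISS"], "ACK": tcb["RCV.NXT"], "CTL": ("SYN", "ACK")}
```

The first branch corresponds to the normal three-way handshake, the
second to a SYN arriving without an ACK of our own SYN.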

fifth,  if  neither  of  the  SYN  or  RST  bits  is  set  then  drop  the 
segment and return.

Otherwise,

first check sequence number

SYN-RECEIVED STATE

ESTABLISHED STATE

FIN-WAIT-1  STATE 

FIN-WAIT-2  STATE 

CLOSE-WAIT  STATE 

CLOSING  STATE 

LAST-ACK  STATE 

TIME-WAIT STATE


Segments  are  processed  in  sequence.  Initial  tests  on  arrival 
are  used  to  discard  old  duplicates,  but  further  processing  is 
done in SEG.SEQ order. If a segment's contents straddle the
boundary  between  old  and  new,  only  the  new  parts  should  be 
processed. 

There  are  four  cases  for  the  acceptability  test  for  an  incoming 
segment: 

Segment Receive  Test
Length  Window
------- -------  -------------------------------------------

   0       0     SEG.SEQ = RCV.NXT

   0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND

  >0       0     not acceptable

  >0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND
              or RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND

If the RCV.WND is zero, no segments will be acceptable, but
special allowance should be made to accept valid ACKs, URGs and
RSTs.
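
The four cases of the acceptability test can be written out as a
short sketch. The function names in_window and segment_acceptable are
hypothetical, and the arithmetic is done modulo 2**32 on the
assumption of a 32-bit circular sequence space; the special allowance
for ACKs, URGs and RSTs when the window is zero is not modelled.

```python
MOD = 2 ** 32  # assumed size of the circular sequence space

def in_window(seq, rcv_nxt, rcv_wnd):
    """RCV.NXT =< seq < RCV.NXT+RCV.WND, evaluated modulo 2**32."""
    return (seq - rcv_nxt) % MOD < rcv_wnd

def segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """Apply the four-case test to one incoming segment."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt                     # case 1
        return in_window(seg_seq, rcv_nxt, rcv_wnd)       # case 2
    if rcv_wnd == 0:
        return False                                      # case 3
    last = (seg_seq + seg_len - 1) % MOD                  # last octet of the segment
    return (in_window(seg_seq, rcv_nxt, rcv_wnd)          # case 4: first octet or
            or in_window(last, rcv_nxt, rcv_wnd))         # last octet in the window
```

For example, with RCV.NXT = 100 and RCV.WND = 10, a ten-octet segment
starting at 95 is acceptable because its last octet (104) falls in the
window, while a five-octet segment starting at 90 is not.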

If an incoming segment is not acceptable, an acknowledgment
should be sent in reply (unless the RST bit is set, if so drop
the segment and return):

<SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>

After sending the acknowledgment, drop the unacceptable segment
and return.

In the following it is assumed that the segment is the idealized
segment  that  begins  at  RCV.NXT  and  does  not  exceed  the  window. 
One  could  tailor  actual  segments  to  fit  this  assumption  by 
trimming  off  any  portions  that  lie  outside  the  window  (including 
SYN  and  FIN),  and  only  processing  further  if  the  segment  then 
begins at RCV.NXT. Segments with higher beginning sequence
numbers  may  be  held  for  later  processing. 
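
One way to tailor a segment to the idealized form is sketched below.
This is an assumption-laden illustration: the name trim_to_window is
hypothetical, only data octets are trimmed (SYN and FIN, which also
occupy sequence space, are not modelled), and sequence arithmetic is
taken modulo 2**32.

```python
MOD = 2 ** 32  # assumed size of the circular sequence space

def trim_to_window(seg_seq, data, rcv_nxt, rcv_wnd):
    """Return (new_seq, new_data) with octets outside the window removed."""
    offset = (rcv_nxt - seg_seq) % MOD      # octets lying before RCV.NXT
    if offset >= MOD // 2:                  # segment starts after RCV.NXT:
        offset = 0                          # nothing old to cut off
    start_seq = (seg_seq + offset) % MOD
    # Window space remaining when counting from start_seq.
    room = rcv_wnd - (start_seq - rcv_nxt) % MOD
    kept = data[offset:offset + max(room, 0)]
    return start_seq, kept
```

A caller would process the result immediately only when new_seq equals
RCV.NXT, and could hold it for later otherwise.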

second check the RST bit,

SYN-RECEIVED STATE

If the RST bit is set

If this connection was initiated with a passive OPEN (i.e.,
came from the LISTEN state), then return this connection to
LISTEN state and return. The user need not be informed. If
this connection was initiated with an active OPEN (i.e., came
from SYN-SENT state) then the connection was refused, signal
the user "connection refused". In either case, all segments
on the retransmission queue should be removed. And in the
active OPEN case, enter the CLOSED state and delete the TCB,
and return.

ESTABLISHED 

FIN-WAIT-1 

FIN-WAIT-2

CLOSE-WAIT 

If the RST bit is set then, any outstanding RECEIVEs and SENDs
should  receive  "reset"  responses.  All  segment  queues  should  be 
flushed.  Users  should  also  receive  an  unsolicited  general 
"connection  reset"  signal.  Enter  the  CLOSED  state,  delete  the 
TCB,  and  return. 

CLOSING  STATE 
LAST-ACK  STATE 
TIME-WAIT 

If  the  RST  bit  is  set  then,  enter  the  CLOSED  state,  delete  the 
TCB,  and  return. 


third check security and precedence

SYN-RECEIVED

If  the  security/compartment  and  precedence  in  the  segment  do  not 
exactly  match  the  security/compartment  and  precedence  in  the  TCB 
then  send  a  reset,  and  return. 

ESTABLISHED  STATE 

If  the  security/compartment  and  precedence  in  the  segment  do  not 
exactly  match  the  security/compartment  and  precedence  in  the  TCB 
then send a reset, any outstanding RECEIVEs and SENDs should
receive "reset" responses. All segment queues should be
flushed.  Users  should  also  receive  an  unsolicited  general 
"connection  reset"  signal.  Enter  the  CLOSED  state,  delete  the 
TCB,  and  return. 

Note  this  check  is  placed  following  the  sequence  check  to  prevent 
a  segment  from  an  old  connection  between  these  ports  with  a 
different  security  or  precedence  from  causing  an  abort  of  the 
current  connection. 

fourth,  check  the  SYN  bit, 

SYN-RECEIVED 
ESTABLISHED  STATE 
FIN-WAIT-1 STATE
FIN-WAIT-2 STATE
CLOSE-WAIT  STATE 
CLOSING  STATE 
LAST-ACK  STATE 
TIME-WAIT  STATE 

If the SYN is in the window it is an error, send a reset, any
outstanding RECEIVEs and SENDs should receive "reset" responses,
all segment queues should be flushed, the user should also
receive an unsolicited general "connection reset" signal, enter
the CLOSED state, delete the TCB, and return.

If  the  SYN  is  not  in  the  window  this  step  would  not  be  reached 
and  an  ack  would  have  been  sent  in  the  first  step  (sequence 
number check).

fifth check the ACK field,

if the ACK bit is off drop the segment and return

if the ACK bit is on

SYN-RECEIVED STATE

If SND.UNA =< SEG.ACK =< SND.NXT then enter ESTABLISHED state
and continue processing.

If the segment acknowledgment is not acceptable, form a
reset segment,

<SEQ=SEG.ACK><CTL=RST>

and  send  it. 

ESTABLISHED  STATE 

If SND.UNA < SEG.ACK =< SND.NXT then, set SND.UNA <- SEG.ACK.
Any segments on the retransmission queue which are thereby
entirely acknowledged are removed. Users should receive
positive acknowledgments for buffers which have been SENT and
fully acknowledged (i.e., SEND buffer should be returned with
"ok" response). If the ACK is a duplicate
(SEG.ACK < SND.UNA), it can be ignored. If the ACK acks
something not yet sent (SEG.ACK > SND.NXT) then send an ACK,
drop the segment, and return.

If SND.UNA < SEG.ACK =< SND.NXT, the send window should be
updated. If (SND.WL1 < SEG.SEQ or (SND.WL1 = SEG.SEQ and
SND.WL2 =< SEG.ACK)), set SND.WND <- SEG.WND, set
SND.WL1 <- SEG.SEQ, and set SND.WL2 <- SEG.ACK.

Note  that  SND.WND  is  an  offset  from  SND.UNA,  that  SND.WL1 
records  the  sequence  number  of  the  last  segment  used  to  update 
SND.WND,  and  that  SND.WL2  records  the  acknowledgment  number  of 
the  last  segment  used  to  update  SND.WND.  The  check  here 
prevents  using  old  segments  to  update  the  window. 
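
The ESTABLISHED-state ACK handling and the window-update guard can be
sketched together. This is a hedged illustration: the TCB is a plain
dictionary, the names seq_lt, seq_le and process_ack and the returned
action strings are hypothetical, retransmission-queue and user
notification steps are omitted, and comparisons are modulo 2**32.

```python
MOD = 2 ** 32  # assumed size of the circular sequence space

def seq_lt(a, b):
    """a < b in circular sequence space."""
    return a != b and (b - a) % MOD < MOD // 2

def seq_le(a, b):
    """a =< b in circular sequence space."""
    return a == b or seq_lt(a, b)

def process_ack(tcb, seg_seq, seg_ack, seg_wnd):
    """Apply the ESTABLISHED-state ACK rules to tcb; return an action."""
    if seq_lt(tcb["SND.NXT"], seg_ack):
        return "send ACK, drop"              # acks something not yet sent
    if seq_lt(seg_ack, tcb["SND.UNA"]):
        return "ignore"                      # duplicate ACK
    if seq_lt(tcb["SND.UNA"], seg_ack):      # SND.UNA < SEG.ACK =< SND.NXT
        tcb["SND.UNA"] = seg_ack
        # Only a segment at least as recent as the last one used may
        # update the window; this keeps old segments from shrinking it.
        if seq_lt(tcb["SND.WL1"], seg_seq) or (
                tcb["SND.WL1"] == seg_seq and seq_le(tcb["SND.WL2"], seg_ack)):
            tcb["SND.WND"] = seg_wnd
            tcb["SND.WL1"] = seg_seq
            tcb["SND.WL2"] = seg_ack
    return "continue"
```

In the second call of the test below, a reordered segment with an
older SEG.SEQ still advances SND.UNA but leaves SND.WND untouched.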


FIN-WAIT-1 STATE

In  addition  to  the  processing  for  the  ESTABLISHED  state,  if 
our  FIN  is  now  acknowledged  then  enter  FIN-WAIT-2  and  continue 
processing  in  that  state. 

FIN-WAIT-2  STATE 

In  addition  to  the  processing  for  the  ESTABLISHED  state,  if 
the  retransmission  queue  is  empty,  the  user's  CLOSE  can  be 
acknowledged ("ok") but do not delete the TCB.

CLOSE-WAIT  STATE 

Do  the  same  processing  as  for  the  ESTABLISHED  state. 

CLOSING  STATE 

In addition to the processing for the ESTABLISHED state, if
the  ACK  acknowledges  our  FIN  then  enter  the  TIME-WAIT  state, 
otherwise  ignore  the  segment. 

LAST-ACK  STATE 

The  only  thing  that  can  arrive  in  this  state  is  an 
acknowledgment  of  our  FIN.  If  our  FIN  is  now  acknowledged, 
delete  the  TCB,  enter  the  CLOSED  state,  and  return. 

TIME-WAIT  STATE 

The  only  thing  that  can  arrive  in  this  state  is  a 
retransmission  of  the  remote  FIN.  Acknowledge  it,  and  restart 
the  2  MSL  timeout. 

sixth, check the URG bit,

ESTABLISHED  STATE 
FIN-WAIT-1  STATE 
FIN-WAIT-2  STATE 

If the URG bit is set, RCV.UP <- max(RCV.UP,SEG.UP), and signal
the user that the remote side has urgent data if the urgent
pointer (RCV.UP) is in advance of the data consumed. If the
user has already been signaled (or is still in the "urgent
mode") for this continuous sequence of urgent data, do not
signal the user again.

CLOSE-WAIT STATE
CLOSING STATE
LAST-ACK STATE
TIME-WAIT

This  should  not  occur,  since  a  FIN  has  been  received  from  the 
remote  side.  Ignore  the  URG. 
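
The urgent-pointer update for the ESTABLISHED, FIN-WAIT-1 and
FIN-WAIT-2 states can be sketched as follows. The names seq_gt and
on_urg are hypothetical. The sketch assumes that SEG.UP has already
been expressed as a sequence number comparable with RCV.UP, that
"consumed" is the sequence number up to which the user has read data,
and that comparisons are modulo 2**32.

```python
MOD = 2 ** 32  # assumed size of the circular sequence space

def seq_gt(a, b):
    """a > b in circular sequence space."""
    return a != b and (a - b) % MOD < MOD // 2

def on_urg(rcv_up, seg_up, consumed, already_signaled):
    """Return (new RCV.UP, signal_user) for a segment with URG set."""
    new_up = seg_up if seq_gt(seg_up, rcv_up) else rcv_up   # max(RCV.UP, SEG.UP)
    # Signal only if urgent data lies ahead of what the user has read
    # and the user is not already in urgent mode for this run of data.
    signal = seq_gt(new_up, consumed) and not already_signaled
    return new_up, signal
```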

seventh,  process  the  segment  text, 

ESTABLISHED  STATE 
FIN-WAIT-1  STATE 
FIN-WAIT-2  STATE 

Once  in  the  ESTABLISHED  state,  it  is  possible  to  deliver  segment 
text  to  user  RECEIVE  buffers.  Text  from  segments  can  be  moved 
into  buffers  until  either  the  buffer  is  full  or  the  segment  is 
empty. If the segment empties and carries a PUSH flag, then
the  user  is  informed,  when  the  buffer  is  returned,  that  a  PUSH 
has  been  received. 

When  the  TCP  takes  responsibility  for  delivering  the  data  to  the 
user  it  must  also  acknowledge  the  receipt  of  the  data. 

Once  the  TCP  takes  responsibility  for  the  data  it  advances 
RCV.NXT over the data accepted, and adjusts RCV.WND as
appropriate to the current buffer availability. The total of
RCV.NXT  and  RCV.WND  should  not  be  reduced. 
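
The rule that the total of RCV.NXT and RCV.WND should not be reduced
can be illustrated with a small sketch. The name accept_data and the
buffer_free parameter are hypothetical, and the sketch assumes that
the number of octets accepted never exceeds the current window.

```python
MOD = 2 ** 32  # assumed size of the circular sequence space

def accept_data(rcv_nxt, rcv_wnd, accepted, buffer_free):
    """Advance RCV.NXT over `accepted` octets and choose a new RCV.WND."""
    old_edge = rcv_nxt + rcv_wnd          # right window edge before accepting
    new_nxt = rcv_nxt + accepted
    # Offer at least enough window to keep the right edge where it was,
    # even if less buffer space is currently free.
    new_wnd = max(buffer_free, old_edge - new_nxt)
    return new_nxt % MOD, new_wnd
```

With RCV.NXT = 1000, RCV.WND = 500 and 200 octets accepted, a receiver
with only 100 octets of free buffer would still advertise 300, so that
the right edge stays at 1500.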

Please  note  the  window  management  suggestions  in  section  3.7. 

Send an acknowledgment of the form:

<SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>

This  acknowledgment  should  be  piggybacked  on  a  segment  being 
transmitted  if  possible  without  incurring  undue  delay. 


CLOSE-WAIT STATE
CLOSING  STATE 
LAST-ACK  STATE 
TIME-WAIT  STATE 

This  should  not  occur,  since  a  FIN  has  been  received  from  the 
remote  side.  Ignore  the  segment  text. 

eighth,  check  the  FIN  bit, 

Do  not  process  the  FIN  if  the  state  is  CLOSED,  LISTEN  or  SYN-SENT 
since  the  SEG.SEQ  cannot  be  validated;  drop  the  segment  and 
return.

If the FIN bit is set, signal the user "connection closing" and
return any pending RECEIVEs with the same message, advance RCV.NXT
over the FIN, and send an acknowledgment for the FIN. Note that
FIN implies PUSH for any segment text not yet delivered to the
user.

SYN-RECEIVED  STATE 
ESTABLISHED  STATE 

Enter the CLOSE-WAIT state.

FIN-WAIT-1  STATE 

If  our  FIN  has  been  ACKed  (perhaps  in  this  segment),  then 
enter  TIME-WAIT,  start  the  time-wait  timer,  turn  off  the  other 
timers;  otherwise  enter  the  CLOSING  state. 

FIN-WAIT-2  STATE 

Enter  the  TIME-WAIT  state.  Start  the  time-wait  timer,  turn 
off  the  other  timers. 

CLOSE-WAIT  STATE 

Remain  in  the  CLOSE-WAIT  state. 

CLOSING  STATE 

Remain  in  the  CLOSING  state. 

LAST-ACK  STATE 

Remain in the LAST-ACK state.

TIME-WAIT STATE

Remain in the TIME-WAIT state. Restart the 2 MSL time-wait
timeout.

and return.
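
The state changes of the eighth step can be summarized in a short
sketch. The function name state_after_fin and its our_fin_acked
parameter are hypothetical; timers, user signals and the
acknowledgment of the FIN are not modelled.

```python
def state_after_fin(state, our_fin_acked=False):
    """Return the state entered after a FIN arrives in `state`."""
    if state in ("CLOSED", "LISTEN", "SYN-SENT"):
        return state                      # FIN is not processed; segment is dropped
    if state in ("SYN-RECEIVED", "ESTABLISHED"):
        return "CLOSE-WAIT"
    if state == "FIN-WAIT-1":
        # Our own FIN may have been ACKed by this very segment.
        return "TIME-WAIT" if our_fin_acked else "CLOSING"
    if state == "FIN-WAIT-2":
        return "TIME-WAIT"
    return state                          # CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT remain
```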


USER TIMEOUT

For  any  state  if  the  user  timeout  expires,  flush  all  queues,  signal 
the  user  "error:  connection  aborted  due  to  user  timeout"  in  general 
and  for  any  outstanding  calls,  delete  the  TCB,  enter  the  CLOSED 
state  and  return. 

RETRANSMISSION  TIMEOUT 

For  any  state  if  the  retransmission  timeout  expires  on  a  segment  in 
the  retransmission  queue,  send  the  segment  at  the  front  of  the 
retransmission  queue  again,  reinitialize  the  retransmission  timer, 
and  return. 

TIME-WAIT TIMEOUT

If  the  time-wait  timeout  expires  on  a  connection  delete  the  TCB, 
enter the CLOSED state and return.

GLOSSARY

1822

BBN Report 1822, "The Specification of the Interconnection of
a Host and an IMP". The specification of the interface between a
host and the ARPANET.

ACK

A control bit (acknowledge) occupying no sequence space, which
indicates that the acknowledgment field of this segment
specifies the next sequence number the sender of this segment
is expecting to receive, hence acknowledging receipt of all
previous sequence numbers.

ARPANET message

The unit of transmission between a host and an IMP in the
ARPANET. The maximum size is about 1012 octets (8096 bits).

ARPANET packet

A unit of transmission used internally in the ARPANET between
IMPs. The maximum size is about 126 octets (1008 bits).

connection

A logical communication path identified by a pair of sockets.

datagram

A message sent in a packet switched computer communications
network.

Destination Address

The destination address, usually the network and host
identifiers.

FIN

A control bit (finis) occupying one sequence number, which
indicates that the sender will send no more data or control
occupying sequence space.

fragment

A portion of a logical unit of data, in particular an internet
fragment is a portion of an internet datagram.

FTP

A file transfer protocol.

header

Control  information  at  the  beginning  of  a  message,  segment, 
fragment,  packet  or  block  of  data. 

host 

A  computer.  In  particular  a  source  or  destination  of  messages 
from the point of view of the communication network.

Identification

An Internet Protocol field. This identifying value assigned
by the sender aids in assembling the fragments of a datagram.

IMP

The Interface Message Processor, the packet switch of the
ARPANET.

internet address

A source or destination address specific to the host level.

internet datagram

The  unit  of  data  exchanged  between  an  internet  module  and  the 
higher  level  protocol  together  with  the  internet  header. 

internet  fragment 

A  portion  of  the  data  of  an  internet  datagram  with  an  internet 
header. 

IP 

Internet  Protocol. 

IRS 

The  Initial  Receive  Sequence  number.  The  first  sequence 
number  used  by  the  sender  on  a  connection. 

ISN 

The  Initial  Sequence  Number.  The  first  sequence  number  used 
on a connection, (either ISS or IRS). Selected on a clock
based  procedure. 

ISS 

The  Initial  Send  Sequence  number.  The  first  sequence  number 
used  by  the  sender  on  a  connection. 

leader 

Control  information  at  the  beginning  of  a  message  or  block  of 
data.  In  particular,  in  the  ARPANET,  the  control  information 
on an ARPANET message at the host-IMP interface.

left sequence

This is the next sequence number to be acknowledged by the
data receiving TCP (or the lowest currently unacknowledged
sequence number) and is sometimes referred to as the left edge
of the send window.

local packet

The  unit  of  transmission  within  a  local  network. 

module 

An  implementation,  usually  in  software,  of  a  protocol  or  other 
procedure.

MSL 

Maximum  Segment  Lifetime,  the  time  a  TCP  segment  can  exist  in 
the  internetwork  system.  Arbitrarily  defined  to  be  2  minutes. 

octet 

An  eight  bit  byte. 

Options 

An Option field may contain several options, and each option
may  be  several  octets  in  length.  The  options  are  used 
primarily  in  testing  situations;  for  example,  to  carry 
timestamps.  Both  the  Internet  Protocol  and  TCP  provide  for 
options  fields. 

packet 

A  package  of  data  with  a  header  which  may  or  may  not  be 
logically  complete.  More  often  a  physical  packaging  than  a 
logical  packaging  of  data. 

port 

The  portion  of  a  socket  that  specifies  which  logical  input  or 
output  channel  of  a  process  is  associated  with  the  data. 

process 

A  program  in  execution.  A  source  or  destination  of  data  from 
the point of view of the TCP or other host-to-host protocol.

PUSH 

A  control  bit  occupying  no  sequence  space,  indicating  that 
this  segment  contains  data  that  must  be  pushed  through  to  the 
receiving  user. 

RCV.NXT 

receive next sequence number

RCV.UP

receive urgent pointer

RCV.WND

receive window

receive next sequence number

This is the next sequence number the local TCP is expecting to
receive.

receive window

This represents the sequence numbers the local (receiving) TCP
is willing to receive. Thus, the local TCP considers that
segments overlapping the range RCV.NXT to
RCV.NXT + RCV.WND - 1 carry acceptable data or control.
Segments containing sequence numbers entirely outside of this
range are considered duplicates and discarded.

RST

A control bit (reset), occupying no sequence space, indicating
that the receiver should delete the connection without further
interaction. The receiver can determine, based on the
sequence number and acknowledgment fields of the incoming
segment, whether it should honor the reset command or ignore
it. In no case does receipt of a segment containing RST give
rise to a RST in response.

RTP

Real Time Protocol: A host-to-host protocol for communication
of time critical information.

SEG.ACK

segment acknowledgment

SEG.LEN

segment length

SEG.PRC

segment precedence value

SEG.SEQ

segment sequence

SEG.UP

segment urgent pointer field

SEG.WND

segment window field

segment 

A  logical  unit  of  data,  in  particular  a  TCP  segment  is  the 
unit of data transferred between a pair of TCP modules.

segment  acknowledgment 

The  sequence  number  in  the  acknowledgment  field  of  the 
arriving  segment. 

segment  length 

The amount of sequence number space occupied by a segment,
including any controls which occupy sequence space.

segment sequence

The number in the sequence field of the arriving segment.

send sequence

This  is  the  next  sequence  number  the  local  (sending)  TCP  will 
use  on  the  connection.  It  is  initially  selected  from  an 
initial  sequence  number  curve  (ISN)  and  is  incremented  for 
each  octet  of  data  or  sequenced  control  transmitted. 

send  window 

This  represents  the  sequence  numbers  which  the  remote 
(receiving)  TCP  is  willing  to  receive.  It  is  the  value  of  the 
window  field  specified  in  segments  from  the  remote  (data 
receiving)  TCP.  The  range  of  new  sequence  numbers  which  may 
be  emitted  by  a  TCP  lies  between  SND.NXT  and 
SND.UNA + SND.WND - 1. (Retransmissions of sequence numbers
between  SND.UNA  and  SND.NXT  are  expected,  of  course.) 
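
As an illustration of the range described above, the number of new
sequence numbers a sender may still emit can be computed as below.
The name usable_window is hypothetical and the arithmetic assumes a
32-bit circular sequence space.

```python
MOD = 2 ** 32  # assumed size of the circular sequence space

def usable_window(snd_una, snd_nxt, snd_wnd):
    """Count the new sequence numbers in SND.NXT .. SND.UNA+SND.WND-1."""
    in_flight = (snd_nxt - snd_una) % MOD   # sent but not yet acknowledged
    return max(snd_wnd - in_flight, 0)
```

With SND.UNA = 100, SND.NXT = 150 and SND.WND = 200 the sender may
emit sequence numbers 150 through 299, that is, 150 more octets.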

SND.NXT 

send  sequence 

SND.UNA 

left  sequence 

SND.UP 

send  urgent  pointer 

SND.WL1 

segment  sequence  number  at  last  window  update 

SND.WL2 

segment  acknowledgment  number  at  last  window  update 


SND.WND

send  window 

socket 

An  address  which  specifically  includes  a  port  identifier,  that 
is, the concatenation of an Internet Address with a TCP port.

Source  Address 

The source address, usually the network and host identifiers.

SYN

A control bit in the incoming segment, occupying one sequence
number, used at the initiation of a connection, to indicate
where the sequence numbering will start.

TCB

Transmission control block, the data structure that records
the state of a connection.

TCB.PRC

The  precedence  of  the  connection. 

TCP 

Transmission  Control  Protocol;  A  host-to-host  protocol  for 
reliable  communication  in  internetwork  environments. 

TOS 

Type  of  Service,  an  Internet  Protocol  field. 

Type  of  Service 

An  Internet  Protocol  field  which  indicates  the  type  of  service 
for  this  internet  fragment. 

URG 

A  control  bit  (urgent),  occupying  no  sequence  space,  used  to 
indicate that the receiving user should be notified to do
urgent  processing  as  long  as  there  is  data  to  be  consumed  with 
sequence  numbers  less  than  the  value  indicated  in  the  urgent 
pointer. 

urgent  pointer 

A  control  field  meaningful  only  when  the  URG  bit  is  on.  This 
field  communicates  the  value  of  the  urgent  pointer  which 
indicates  the  data  octet  associated  with  the  sending  user's 
urgent call.

PRE-EMPTION

In circuit-switching systems, once a user has acquired a circuit, the
communication bandwidth of that circuit is dedicated, even if it is not
used. When the system saturates, additional circuit set-up requests are
blocked. To allow high precedence users to gain access to circuit
resources, systems such as AUTOVON associate a precedence with each
telephone instrument. Those instruments with high precedence can
pre-empt circuit resources, causing lower precedence users to be cut
off.

In message switching systems such as AUTODIN I, incoming traffic is
stored on disks (or drums or tape) and processed in order of
precedence. If a high precedence message is entered into the system, it
is processed and forwarded as quickly as possible. When the high
precedence message arrives at the destination message switch, it may
pre-empt the use of the output devices on the switch, interrupting the
printing of a lower precedence message.

In packet switching systems, there is little or no storage in the
transport system so that precedence has little impact on delay for
processing a packet. However, when a packet switching system reaches
saturation, it rejects offered traffic. Precedence can be used in
saturated packet switched systems to sort traffic queued for entry into
the system.

In general, precedence is a tool for deciding how to allocate resources
when systems are saturated. In circuit switched systems, the resource
is circuits; in message switched systems the resource is the message
switch processor; and in packet switching the resource is the packet
switching system itself.

This capability can be realized in AUTODIN II without adding any new
mechanisms to TCP (except to make precedence of incoming connection
requests visible to the processes which use TCP). To allow pre-emptive
access to a particular terminal, the software (i.e., THP) which supports
terminal access to the TAC can be configured so as to always have a
LISTEN posted for that terminal, even if the terminal has a connection
in operation. For example in the ARPANET TENEX systems, the user TELNET
permits a user to have many connections open at one time - the user can
switch among them at will. To the extent that this can be done without
violating security requirements, one could imagine a multi-connection
THP which always leaves a LISTEN pending for incoming connection
requests. If a connection is established, the THP can decide, based on
its precedence, whether to pre-empt any existing connection and to
switch the user to the high precedence one.

If the user is working with several connections of different precedence
at the same time, the THP would close or abort the lowest precedence
connection in favor of the higher precedence pre-empting one. Then the
THP would do a new LISTEN on that terminal's port in case a higher
precedence connection is attempted.

One of the reasons for suggesting this model is that processes are the
users of TCP (in general) and that TCP itself cannot cause processes to
be created on behalf of an incoming connection request. Implementations
could be realized in which TCPs accept incoming connection requests and,
based on the destination port number, create appropriate server
processes. In terms of pre-empting access to a remote terminal,
however, it seems more sensible to let the process which interfaces the
terminal to the system mediate the pre-emption. If the terminal is not
connected or is turned off, there is no point in creating a process to
serve the incoming high precedence connection request.

For example, suppose a routine FTP is in operation between Host X and
Host Y. Host Z decides to do a flash-override FTP to Host X. It opens
a high precedence connection via its TCP and the "SYN" goes out to the
FTP port on Host X.

FTP always leaves one LISTEN pending to pre-empt lower precedence remote
users if it cannot serve one more user (and still keep a LISTEN
pending). In this way, the FTP is naturally in a state permitting the
high precedence connection request to be properly served, and the FTP
can initiate any cleaning up that is needed to deal with the
pre-emption.

In general, this strategy permits the processes using TCP to accommodate
pre-emption in the context of the applications they support.

A non-pre-emptable process is one that does not have a LISTEN pending
while it is serving one (or more) users.

The actions taken to deal with pre-emption of TCP connections will be
application-process specific and this strategy of a second (or N+1st)
LISTEN is well suited to the situation.

Pre-emption may also be necessary at the site initiating a high
precedence connection request. Suppose there is a high precedence user
who wants to open an FTP connection request from Host Z to Host X, but
all FTP and/or TCP resources are saturated when this user tries to start
the user FTP process. In this case, the operating system would have to
know about the precedence of the user and would have to locally pre-empt
resources on his behalf (e.g., by logging out lower precedence users).
This is a system issue, not specific only to TCP. Implementation of
pre-emption at the source could vary greatly. Precedence may be
associated with a user or with a terminal. The TCP implementation may
locally pre-empt resources to serve high precedence users. The
operating system may make all pre-emption decisions.